METHODS AND COMPOSITIONS FOR THE CHROMOSOMAL INTEGRATION OF HETEROLOGOUS
SEQUENCES

Related Information 

The contents of the patents, patent applications, and references cited throughout
this specification are hereby incorporated by reference in their entireties.

Government Sponsored Research 

This work was supported, in part, by grants from the Florida Agricultural 
Experiment Station (publication number R-06853), the U.S. Department of Agriculture,
National Research Initiative (98-35504-6177 & 98-35505-6976), and the U.S. 
Department of Energy, Office of Basic Energy Science (DE-FG02-96ER20222). 

Background of the Invention 

Plasmid vectors are versatile tools which facilitate the isolation, expression and
analysis of genes (Bolivar, F., et al., Gene 2: 95-113 (1977)). Useful characteristics
include the facile production of identical DNA for subsequent in vitro and in vivo
manipulation, the presence of multiple cloning sites (MCS), selectable markers which
allow rapid screening for new or improved traits, and the ease with which they can be
established as multiple cellular copies to alter gene expression in recombinant hosts.
However, the physiological burden imposed by multiple copies of plasmid genes, the
potential for internal rearrangements, and segregational instability are disadvantages for
many biotechnological applications (Peredelchuk, M. Y., Gene 187: 231-238 (1997)).
Antibiotic-resistance genes are frequently used for plasmid maintenance.
Alternative selectable markers based on metabolic deficiencies of the host (Degryse, E.,
J. Biotech. 18: 29-40 (1991)) pose further complications for improvement cycles in
production strains. For applications such as deliberate field release, development of
organisms for use in food products, and development of biocatalysts for bulk chemicals,
special requirements for plasmid maintenance are undesirable.

Many of the problems associated with plasmids can be eliminated by the
chromosomal integration of desired traits. Integration tools based on modified
transposons and conditional plasmid replicons have been developed (de Lorenzo, V., et
al., J. Bacteriol. 172: 6568-6572 (1990); de Lorenzo, V., et al., Methods Enzymol. 235:
386-405 (1994); Rode, C. K., et al., Gene 166: 1-9 (1995); Hamilton, C. H., et al., J.
Bacteriol. 171: 4617-4622 (1989); Kaniga, K., et al., J. Bacteriol. 109: 137-141 (1991);
Le Borgne, S., et al., Gene 223: 213-219 (1998); Link, A. J., et al., J. Bacteriol. 179:
6228-6237 (1997)). With these tools, integration can be random or precisely directed by
DNA fragments homologous to the host genome. However, complications still remain
with most integration systems, such as the persistence of selectable markers, transposons,
or replicons. For strains in which multiple alterations or continuing improvements are
desired, the accumulation of markers and delivery systems can be troublesome.
Selectable events may be limited by the availability of functional markers. Integrated
DNA (replicons, transposon genes, and selectable marker genes) can serve as a site for
homologous recombination events which interfere with targeting or randomness during
subsequent constructions. Also, the persistence of replicons and transposons increases
the potential for gene transfer to other organisms in the environment.

Replicons and transposons can be eliminated by transforming with purified DNA
fragments which lack replication functions (Hasan, N., et al., Gene 150: 51-56 (1994);
Ohta, K., et al., Appl. Environ. Microbiol. 57: 893-900 (1991)). Non-antibiotic markers
are available but are often less efficient than antibiotics (de Lorenzo, V., et al., J.
Bacteriol. 172: 6568-6572 (1990); de Lorenzo, V., et al., Methods Enzymol. 235: 386-405
(1994); Herrero, M., et al., J. Bacteriol. 172: 6557-6567 (1990)). In a few cases,
loss of functions such as tetracycline-sensitivity and absence of the sucrose-sacB system
can be selected directly (Bochner, B. R., et al., J. Bacteriol. 143: 926-933 (1980);
Kaniga, K., et al., J. Bacteriol. 109: 137-141 (1991); Ried, J. L., et al., Gene 57: 239-246
(1985)). However, loss of function due to a mutation is typically not a precise event and
can result from unstable point mutations, partial deletion of the resistance gene, or
extended deletions which impair the host.

Summary of the Invention 

The foregoing limitations are overcome using the method and vectors of the
present invention. In particular, the invention provides a method for integrating nucleic
acids into a genome in such a way that any unwanted vector or selectable marker DNA
can be removed. This allows the genome of the recipient host cell to be made
substantially free of any unnecessary nucleic acid, e.g., vector sequence or marker
sequence, that can lead to genomic instability.

Accordingly, in one aspect, the invention provides a method for integrating a
nucleic acid construct into the genome of a host cell by contacting the cell with a nucleic
acid construct under conditions such that the nucleic acid construct is integrated by the
cell. The method uses a nucleic acid construct that contains a passenger sequence and a
marker sequence, where the marker sequence is flanked by a first and second
recombining site. In one embodiment, the first and second recombining sites are oriented
in the same direction.

In another embodiment, the method employs a construct that further contains an
origin of replication between the first and second recombining sites, e.g., a conditional
origin of replication (i.e., replicon), preferably, e.g., pSC101ori, R6K-γori, colE1, oriEV,
or an origin of replication derived therefrom.

In another embodiment, the nucleic acid construct of the above aspect contains a
sequence that contains a promoter, a restriction site, an intron, an exon, an IRES
element, a polyadenylation site, or a combination thereof.

In yet another related embodiment, the nucleic acid construct of the above aspect 
contains a guide sequence capable of directing site-specific integration of the nucleic 
acid construct to a specific site in the sequence of a replicating genome. 

In even another embodiment, the above method involves exposing the targeted
cell to a site-specific recombinase that results in recombination between the first and
second recombining sites of the foregoing nucleic acid construct such that the
intervening sequence flanked by the recombining sites is excised. The method involves
using a recombinase such as Xer, Int, Cre, or, preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att,
loxP, and preferably FRT. In a preferred embodiment, the above method can be
repeated such that more than one genetic element may be introduced sequentially.

In still another embodiment, the passenger sequence encodes at least one gene,
preferably a gene involved in ethanologenesis, such as, for example, adh or pdc. In a
related embodiment, the gene involved in ethanologenesis may be derived from a
prokaryote or a eukaryote. In another related embodiment, the passenger sequence may
further contain a promoter 5' to the passenger sequence.

In another embodiment, the above method employs a nucleic acid construct 
containing a marker sequence that encodes a selectable gene, e.g., an antibiotic 
resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the
antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, 
kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or 
chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic 
resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene 
encoding a green fluorescent protein.

In a preferred embodiment of the first aspect, the method employs a bacterial 
cell, preferably a Gram-negative bacterial cell, more preferably a facultatively anaerobic 
bacterial cell, more preferably a bacterial cell selected from the family 
Enterobacteriaceae, and most preferably, a bacterial cell of the genus Klebsiella or 
Escherichia (e.g., E. coli B or E. coli K12). In a related embodiment, the host cell is a
recombinant bacterial cell. In another related embodiment, the method uses a nucleic 
acid construct that is, or is derived from, a plasmid selected from the group consisting of 
pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403. 

In a second aspect, the invention provides a method for producing a recombinant 
ethanologenic cell by contacting a cell with a nucleic acid construct under conditions in
which integration of the nucleic acid construct occurs resulting in the formation of a 
recombinant ethanologenic cell. The method uses a nucleic acid construct that contains 
a passenger sequence that contains an ethanologenic gene, and a marker sequence, 
flanked by a first and second recombining site. In one embodiment, the first and second
recombining sites are oriented in the same direction.

In another embodiment, the passenger sequence encodes an ethanologenic gene 
such as adh or pdc. In a related embodiment, the passenger sequence encodes pdc,
adhB, and cat. 

In another embodiment, the nucleic acid further contains a guide sequence 
thereby resulting in site-specific integration of the nucleic acid construct. In a related
embodiment, the guide sequence is derived from a replicating genome. 

In yet another embodiment, the method employs a construct that further contains
an origin of replication between the first and second recombining sites, e.g., a
conditional origin of replication (i.e., replicon), preferably, e.g., pSC101ori, R6K-γori,
colE1, oriEV, or an origin of replication derived therefrom.

In even another embodiment, the above method involves exposing the targeted
cell to a site-specific recombinase that results in recombination between the first and
second recombining sites of the foregoing nucleic acid construct such that the
intervening sequence flanked by the recombining sites is excised. The method involves
using a recombinase such as Xer, Int, Cre, or, preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att,
loxP, and preferably FRT.

In still another embodiment, the above method employs a nucleic acid construct 
containing a marker sequence that encodes a selectable gene, e.g., an antibiotic 
resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the
antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene,
kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or 
chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic 
resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene 
encoding a green fluorescent protein. 

In a preferred embodiment of the foregoing aspect, a recombinant ethanologenic
cell is produced according to the above method.

In a third aspect, the invention provides a recombinant host cell having a nucleic 
acid construct that contains a passenger sequence and a marker sequence, flanked by a 
first and second recombining site. In one embodiment, the first and second recombining
sites are oriented in the same direction.

In a related embodiment, the host cell contains a nucleic acid construct where the 
passenger sequence encodes a gene involved in ethanologenesis such as, e.g., adh or 
pdc. In a related embodiment, the passenger sequence is selected from the group 
including adh, pdc, and cat.

In one embodiment, the host cell contains a nucleic acid construct that further
contains an origin of replication between the first and second recombining sites, e.g., a
conditional origin of replication (i.e., replicon), preferably, e.g., pSC101ori, R6K-γori,
colE1, oriEV, or an origin of replication derived therefrom.

In another embodiment, the host cell contains a nucleic acid construct containing
a guide sequence capable of directing site-specific integration of the nucleic acid
construct to a specific site in the sequence of a replicating genome.

In yet another embodiment, the host cell contains a nucleic acid construct
containing a marker sequence that encodes a selectable gene, e.g., an antibiotic
resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the
antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene,
kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or
chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic
resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene
encoding a green fluorescent protein.

In another embodiment, the host cell is exposed to a site-specific recombinase
that results in recombination between the first and second recombining sites of the
foregoing nucleic acid construct such that the intervening sequence flanked by the
recombining sites is excised. In a preferred embodiment, the recombination involves
using a recombinase such as Xer, Int, Cre, or, preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att,
loxP, and preferably FRT.

In a preferred embodiment, the host cell is a bacterial cell, e.g., a recombinant
cell and/or an ethanologenic cell, preferably a Gram-negative bacterial cell, more
preferably a facultatively anaerobic bacterial cell, more preferably selected from the
family Enterobacteriaceae, and most preferably, of the genus Klebsiella or Escherichia
(e.g., E. coli B or E. coli K12). In a related embodiment, the host cell contains a nucleic
acid construct that is, or is derived from, a plasmid selected from the group consisting of
pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403.

In a fourth aspect, the invention provides a method for producing ethanol by
providing a recombinant ethanologenic cell with a nucleic acid construct that contains a
passenger sequence and a marker sequence, where the marker sequence is flanked by a



WO 01/18222 



PCT/US00/22700 



first and second recombining site. The method includes contacting the cell with a 
substrate which can be fermented into ethanol, such that expression of the passenger 
sequence results in the production of ethanol. 

In one embodiment, the first and second recombining sites are oriented in the
same direction.

In another embodiment, the passenger sequence encodes a gene involved in 
ethanologenesis, such as, e.g., adh or pdc. In a related embodiment, the passenger 
sequence encodes adhB, pdc, and cat. 

In another embodiment, the method employs a construct that further contains an
origin of replication between the first and second recombining sites, e.g., a conditional
origin of replication (i.e., replicon), preferably, e.g., pSC101ori, R6K-γori, colE1, oriEV,
or an origin of replication derived therefrom.

In yet another embodiment, the nucleic acid construct of the above aspect 
contains a guide sequence capable of directing site-specific integration of the nucleic 
acid construct to a specific site in the sequence of a replicating genome.

In yet another embodiment, the above method involves exposing the targeted cell
to a site-specific recombinase that results in recombination between the first and
second recombining sites of the foregoing nucleic acid construct such that the
intervening sequence flanked by the recombining sites is excised. The method involves
using a recombinase such as Xer, Int, Cre, or, preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att,
loxP, and preferably FRT.

In still another embodiment, the above method employs a nucleic acid construct
containing a marker sequence that encodes a selectable gene, e.g., an antibiotic
resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the
antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene,
kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or
chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic
resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene
encoding a green fluorescent protein.

In a preferred embodiment, the method employs a host cell that is a bacterial cell,
e.g., a recombinant bacterial cell, preferably a Gram-negative bacterial cell, more
preferably a facultatively anaerobic bacterial cell, more preferably selected from the
family Enterobacteriaceae, and most preferably, either Klebsiella or Escherichia (e.g., E.
coli B or E. coli K12).

In one embodiment, the method uses a nucleic acid construct that is, or is derived 
from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, 
pLOI2226, pLOI2227, pLOI2228, and pLOI2403. 

In a fifth aspect, the invention provides a nucleic acid construct containing a
passenger sequence and a marker sequence, where the marker sequence is flanked by a
first and second recombining site. In a related embodiment, the first and second
recombining sites may be oriented in the same direction.

In another embodiment, the construct further contains an origin of replication
between the first and second recombining sites, e.g., a conditional origin of replication
(i.e., replicon), preferably, e.g., pSC101ori, R6K-γori, colE1, oriEV, or an origin of
replication derived therefrom.

In another embodiment, the nucleic acid construct contains a guide sequence
derived from a replicating genome. In a related embodiment, the guide sequence is
derived from a bacterial cell.

In yet another embodiment, the construct further contains at least one unique
restriction enzyme site. In yet another related embodiment, the passenger sequence of
the construct contains an ethanologenic gene, such as, e.g., adh or pdc, and preferably
adhB, pdc, cat or a combination thereof.

In still another related embodiment, the passenger sequence of the construct 
contains a sequence selected from the group consisting of a heterologous promoter and a
prokaryotic termination sequence. 

In even another embodiment, the nucleic acid construct contains a marker
sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a
non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is
the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene,
ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance
gene. In another preferred embodiment, the non-antibiotic resistance gene is an
auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green
fluorescent protein.

In another embodiment, the construct is exposed to a site-specific recombinase
that results in recombination between the first and second recombining sites of the
nucleic acid construct such that the intervening sequence flanked by the recombining
sites is excised. The excision involves using a recombinase such as Xer, Int, Cre, or,
preferably, FLP recombinase. Accordingly, in a related embodiment, the construct
contains recombining sites such as dif, att, loxP, and preferably FRT.

In a preferred embodiment of the above aspect, the construct is, or is derived 
from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225,
pLOI2226, pLOI2227, pLOI2228, and pLOI2403. 

In a related embodiment, the invention provides a kit including at least one of the 
foregoing nucleic acid constructs and instructions for use. 

Other features and advantages of the invention will be apparent from the
following detailed description and claims.

Brief Description of the Drawings 

Figure 1 shows a schematic of various integration vectors and helper plasmids 
(see text for details).

Figure 2 shows a schematic illustrating the use of an integration vector for the
insertion of heterologous "passenger" genes into a host genome. A "guide" sequence
allows for the site-specific targeting of the passenger sequence via homologous
recombination. A helper plasmid provides a recombinase (FLP) that catalyzes the in
vivo excision of any unnecessary sequence (e.g., the replicon and marker sequence)
flanked by recombining sites (FRT), leaving the passenger gene resident in the genome.

Figure 3 shows a diagram illustrating the targeting of heterologous "passenger"
genes to a site-specific region of a genome (i.e., the adhE gene (shaded)) and
recombinase-mediated deletion of the replicon and marker used during plasmid
construction and initial integration. The alignment of the guide sequence of the
targeting vector and the genomic integration site (Panel A), the cross-over event (Panel B),
and the resultant recombinase-mediated (FLP) excision of the replicon and selectable
marker (Panel C) are shown.

Figure 4 is a photograph of an agarose gel stained with ethidium bromide
showing nucleic acid fragments PCR-amplified using different primers (Panel
A) and digested with restriction enzymes (Panel B) to confirm the correct chromosomal
integration of a heterologous nucleic acid (see text for details).

Detailed Description of the Invention 

In order for the full scope of the invention to be clearly understood, the following 
definitions are provided.



I. Definitions

As used herein, the terms "host cell" and "recombinant host cell" are intended to
include a cell suitable for genetic manipulation, e.g., which can incorporate heterologous
polynucleotide sequences, e.g., which can be transfected. The cell can be a
microorganism or a higher eukaryotic cell, such as an animal cell or a plant cell. The
term is intended to include progeny of the cell originally transfected. In preferred
embodiments, the cell is a bacterial cell, e.g., a Gram-negative bacterial cell, and this
term is intended to include all facultatively anaerobic Gram-negative cells of the family
Enterobacteriaceae such as Escherichia, Shigella, Citrobacter, Salmonella, Klebsiella,
Enterobacter, Erwinia, Kluyvera, Serratia, Cedecea, Morganella, Hafnia, Edwardsiella,
Providencia, Proteus, and Yersinia. Particularly preferred recombinant hosts are
Escherichia coli or Klebsiella oxytoca cells. Preferably, the term recombinant host cell
is intended to include a cell that has already been selected or engineered to have certain
desirable properties and is suitable for further modification using the compositions and
methods of the invention.

The term "passenger sequence" is intended to include any desired sequence that
is intended for integration into the host cell. For example, the passenger sequence may
include a restriction site, multiple restriction sites (e.g., a polylinker or multiple cloning
site (MCS)), a unique stretch of sequence suitable for marking the cell as distinct from a
wild type cell, a regulatory element, e.g., a polyadenylation sequence, a promoter, an
intron, a splice signal, an internal ribosomal entry site (IRES) for regulating the
expression of multiple genes or cistrons, and/or a gene, for example a gene encoding a
polypeptide. A passenger sequence encoding a gene may further include a promoter if
appropriate or use the endogenous promoter found proximal to the site of integration. In
addition, the passenger sequence encoding a gene may further include other regulatory
elements, if appropriate, to improve expression. A passenger sequence may comprise
genetic elements derived from any source, e.g., eukaryotes, prokaryotes, viruses, or
synthetic polynucleotide fragments.

The term "guide sequence" is intended to include a sequence that can be located
5', 3', or internal to the sequence intended to be integrated, such that recombination
between the introduced vector (or portion thereof) and the recipient host cell genome is
accomplished. Typically, a guide sequence has high similarity or identity with a
site-specific region of the recipient host cell genome such that a targeted integration of the
passenger sequence to this site by homologous recombination is accomplished.

The term "site-specific integration" is meant to refer to the integration of an
exogenous nucleic acid sequence to a specific area of the genome of a recipient host cell.
In general, a guide sequence allows for homologous recombination between a
portion of the targeting vector and the host cell genome such that the sequence is
predictably targeted to the area of the genome represented in part by the guide
sequence.

The term "marker sequence" is intended to include any sequence that can be
encoded by a nucleic acid, introduced into the replicating genome of a recipient host
cell, and detected, thereby indicating that the cell has been "marked" by such a
sequence. Accordingly, the term is intended to include a sequence having, for example, a
restriction enzyme sequence that can be detected by a corresponding restriction enzyme.
In addition, or alternatively, the sequence may be detected using any of a variety of
art-recognized techniques, e.g., polymerase chain reaction using appropriate primers,
nucleic acid hybridization, etc. Preferably, the marker sequence encodes a gene product
that confers on the cell a selectable phenotype, e.g., resistance to an antibiotic or other
cytotoxic agent or a conditional growth advantage. Accordingly, a marker sequence can
encode, e.g., the gentamycin resistance gene, zeocin resistance gene, kanamycin
resistance gene, ampicillin resistance gene, tetracycline resistance gene,
chloramphenicol resistance gene, hygromycin resistance gene, thymidine kinase, an
auxotrophic gene, a metal ion resistance gene, a trp gene, a gene producing a visual
phenotype (e.g., green fluorescent protein), or a gene producing a cell surface antigen
(e.g., CD4). Any of the foregoing markers when expressed in cells can be detected
using art-recognized techniques such as, e.g., appropriate culture conditions or selection
techniques (e.g., antibodies, FACS (fluorescence activated cell sorting), or flow
cytometric analysis).

The term "origin of replication" or "replicon" is intended to include any
sequence, conditional or otherwise, that can confer on a nucleic acid sequence the ability
to be replicated in a cell for the purposes of, e.g., propagating the nucleic acid or
maintaining the presence of the nucleic acid in a cell. Such sequences are known in the
art and are routinely incorporated into the backbone of many nucleic acid vectors to
facilitate their propagation and typically include, e.g., pSC101ori, R6K-γori, colE1ori, or
oriEV.

The term "recombining site" is intended to include a nucleic acid element that 
represents a site-specific binding site for a recombinase. Examples of such elements
include, e.g., the FRT sequence, the dif sequence, the att sequence, and the loxP 
sequence. 

The term "recombinase" is intended to include any recombinase having a
site-specific recombinase activity. Examples of such recombinases are FLP, Xer, Int, and
Cre, and these enzymes typically have site-specific recombinase activity on, respectively,
the FRT sequence, the dif sequence, the att sequence, and the loxP sequence.
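
The recombinase and recombining-site pairings listed in this definition can be written as a small lookup table. The Python sketch below is illustrative only: the names `RECOMBINASE_TO_SITE` and `sites_match` are hypothetical and do not appear in the specification.

```python
# Recombinase -> recombining site, as paired in the definition above.
RECOMBINASE_TO_SITE = {
    "FLP": "FRT",
    "Xer": "dif",
    "Int": "att",
    "Cre": "loxP",
}


def sites_match(recombinase: str, first_site: str, second_site: str) -> bool:
    """Return True if both flanking sites are the kind the recombinase acts on."""
    expected = RECOMBINASE_TO_SITE[recombinase]
    return first_site == expected and second_site == expected
```

For example, `sites_match("FLP", "FRT", "FRT")` is true, while a construct flanked by FRT sites would not be matched by Cre.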

The term "gene involved in ethanologenesis" is intended to include any gene
capable of conferring on a cell ethanologenic properties or capable of improving any
aspect of cellular ethanologenesis, such as, e.g., substrate uptake, substrate processing,
ethanol tolerance, etc. Genes involved in ethanologenesis are, e.g., alcohol
dehydrogenase, pyruvate decarboxylase, secretory protein(s), and polysaccharases, and
these genes, or their homologs, may be derived from any appropriate organism.

The term "substrate" is intended to include any moiety that can be converted into 
ethanol. Substrates that are suitable for converting into ethanol include sugar moieties
such as, e.g., monosaccharides, disaccharides, trisaccharides, oligosaccharides, and
complex carbohydrates, including, e.g., lignocellulose, which comprises cellulose, 
hemicellulose, and pectin. 

The term "derived from" is intended to include the isolation (in whole or in part)
of a polynucleotide segment from an indicated source. The term is intended to include,
for example, direct cloning, PCR amplification, or artificial synthesis from, or based on,
a sequence associated with the indicated polynucleotide source.

The term "ethanologenic" is intended to include the ability of a microorganism to
produce ethanol from a carbohydrate as a primary fermentation product. The term is
intended to include naturally occurring ethanologenic organisms, ethanologenic
organisms with naturally occurring or induced mutations, and ethanologenic organisms
which have been genetically modified.

The term "Gram-negative bacteria" is intended to include the art-recognized
definition of this term. Typically, Gram-negative bacteria include, for example, the
family Enterobacteriaceae which comprises, among others, the genera Escherichia and
Klebsiella.

II. Nucleic Acid Constructs

The invention provides a number of novel nucleic acid constructs that are 
suitable for targeting a heterologous nucleic acid sequence to the genome or 
chromosome of a recipient host cell. 

The heterologous sequence or passenger sequence of the targeting vector can be
any desirable sequence that is useful when introduced into a cell. Accordingly, the
sequence may be a regulatory element (e.g., a promoter, intron, splicing signal, internal
ribosome entry site, polyadenylation signal) or a gene encoding a gene product, e.g., a
polypeptide, or a combination thereof (e.g., a gene encoding a polypeptide that is under
the transcriptional control of a promoter). In addition, the heterologous sequence may
be targeted to any replicating genome, e.g., that of a bacterial cell, a yeast, an insect cell, a plant
cell, or an animal cell. It is well known in the art which regulatory elements, e.g.,
promoters, etc., are suitable for use in, e.g., a bacterial cell versus a plant or animal cell.

If site-specific integration of the passenger sequence is desired (as opposed to
random integration), a guide sequence must be incorporated into the nucleic acid vector.
Typically the guide sequence contains a portion of the genomic locus to be targeted.
Careful design of this region allows for the accurate placement of the passenger
sequence anywhere in the genome, preferably, e.g., under the control of a desirable
endogenous promoter.

To facilitate the construction of the targeting vector, an origin of replication, or
replicon, and a marker sequence are included in the vector. Specifically exemplified in
the present invention are the markers encoding antibiotic resistance genes to ampicillin,
kanamycin, and chloramphenicol; however, it will be appreciated that other selectable
markers may be used as described above. In part, the selection of an appropriate marker
will depend on the organism to be targeted, e.g., prokaryotic versus eukaryotic. The
characteristics of the foregoing markers and their efficacies in various organisms are well
known in the art.

In addition to an appropriate marker, an origin of replication or replicon is also
incorporated into the targeting construct to facilitate propagation of the vector during the
construction and/or initial integration of the vector. Specifically exemplified are the
pSC101 and R6K-γ replicons; however, other replicons, e.g., colE1 and oriEV, as well as
replicons suitable for use in eukaryotic cells are also encompassed by the invention.

Importantly, the constructs of the invention possess site specific recombining 
sites that flank all of the vector sequence that, after targeting of the passenger sequence 
to the genome, is no longer necessary or desired. Indeed, one advantage of the invention 
is that following targeting and integration of the passenger sequence to the genome of 
the host cell, the in vivo excision of any undesired sequence can be accomplished using a
recombinase that binds the site specific recombining sites contained in the integrated 
vector. 

Critical for allowing the correct excision of the undesired vector sequence is the
placement and directionality of the recombining sites. The novel vectors of the
invention possess a first and a second recombining site that flank all of the replicon
and marker sequence and are oriented in the same direction. This allows for the
complete excision of all the vector sequence, leaving behind the desired targeted
passenger sequence and a minimal remainder of the joined recombining sites (68 bp).
While the FRT recombining site and corresponding FLP recombinase are
exemplified (Senecoff, J., et al., P.N.A.S. 82:7270-7274 (1985); Szybalski, W., Gene
135:279-290 (1993); Hoang, T.T., et al., Gene 212:77-86 (1998)), a number of other site
specific recombining sites (e.g., dif, att, loxP) and corresponding recombinases (e.g.,
Xer, Int, Cre) may be used. Accordingly, any of the above described markers may be
used again after the excision step because the recipient host cell has once again been
rendered sensitive to the selectable marker. Thus, one can sequentially introduce
multiple genes into an organism without ever running into the difficulty of having to
change markers or exhaust all suitable markers. Accordingly, if a particular selectable
marker works efficiently in a particular organism, it may be exclusively used to
introduce multiple passenger sequences into a target organism.
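
The dependence of excision on site placement and orientation described above can be illustrated with a small toy model. The sketch below is only an illustration under stated assumptions: the element names, the tuple representation, and the function `excise` are hypothetical and are not part of the disclosed constructs. It treats an integrated vector as an ordered list of labeled elements and removes everything between two recombining sites that point in the same direction, leaving one site behind.

```python
from typing import List, Tuple

# (name, orientation); orientation is "+" or "-". Purely illustrative labels.
Element = Tuple[str, str]


def excise(elements: List[Element], site: str = "FRT") -> List[Element]:
    """Toy model of recombinase-mediated excision.

    Removes every element lying between two copies of `site` that share the
    same orientation and keeps a single copy of the site as the remainder.
    """
    positions = [i for i, (name, _) in enumerate(elements) if name == site]
    if len(positions) != 2:
        raise ValueError("this sketch expects exactly two recombining sites")
    first, second = positions
    if elements[first][1] != elements[second][1]:
        # Oppositely oriented sites lead to inversion rather than excision.
        raise ValueError("sites are in opposite orientation; no excision")
    # Keep everything up to and including the first site, then skip to the
    # element that follows the second site.
    return elements[: first + 1] + elements[second + 1:]


# Hypothetical layout of an integrated vector (assumed for illustration only).
integrated = [
    ("chromosome_left", "+"),
    ("passenger", "+"),
    ("FRT", "+"),
    ("replicon", "+"),
    ("marker", "+"),
    ("FRT", "+"),
    ("chromosome_right", "+"),
]

if __name__ == "__main__":
    print(excise(integrated))
```

In this model the replicon and marker disappear because they lie between the two sites, while the passenger, which lies outside them, is retained together with one remaining site.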

In a preferred embodiment, the constructs of the invention are modified to
contain a passenger sequence encoding ethanol pathway gene products such as, e.g.,
alcohol dehydrogenase (e.g., adhB) and pyruvate decarboxylase (pdc), that are necessary
for ethanol production in, e.g., a microorganism. A guide sequence specific for the
endogenous gene in the target organism allows for the targeting of these genes to a
specific locus such that the heterologous ethanol pathway genes (i.e., adhB and pdc) are
brought under the transcriptional control of an appropriate endogenous promoter (e.g.,
adhE).

Accordingly, the foregoing novel constructs allow for the genetic engineering of
superior host cells suitable, e.g., for various industrial applications, such as, e.g., use in a
fermentation reaction for producing, e.g., ethanol (for a review of other industrial
applications, see, e.g., Barrios-Gonzalez et al., Biotechnol. Ann. Rev. 2:85-121 (1996);
From Ethnomycology to Fungal Biotechnology: Exploiting from Natural Resources for
Novel Products, Singh, J., Ed., Plenum Press, Pub. (1999); Manual of Industrial
Microbiology and Biotechnology, Demain, A., Ed., Am. Soc. of Microbiology, Pub.
(1999); Biomining: Theory, Microbes, and Industrial Processes, Rawlings, Ed., R.G.
Landes Co., Pub. (1997); Biotechnology of Industrial Antibiotics, Vandamme, E., Ed.,
Marcel Dekker, Pub. (1984); Industrial Biotechnology, Malik, V., Ed., Science, Pub.
(1992); Biotechnology and Food Ingredients, Goldberg et al., Ed., Aspen Publishers
(1991); Biotechnology and Food Process Engineering, Schwartzberg et al., Ed., Marcel
Dekker, Pub. (1990); and Food Biotechnology: Techniques and Applications, Mittal, G.,
Technomic Pub. Co. (1992)).

III. Methods of Use

In a preferred embodiment, the methods of the invention apply the constructs 
described above for the integration of one or more desirable gene sequences into a host 
cell. Accordingly, the methods of the invention have immediate application in the 
genetic engineering of, e.g., microorganisms (e.g., for various industrial applications),
plants (e.g., for pest resistance, hardiness, yields), and animal cells (e.g., for producing 
cytokines, hormones, etc.). 

In one particular application of the methods of the invention, one or more genes
necessary for producing ethanol are provided on a plasmid or integrated into a host
chromosome using the method of the invention. More preferably, essential genes for
fermenting a sugar substrate into ethanol, e.g., pyruvate decarboxylase (e.g., pdc) and/or
alcohol dehydrogenase (e.g., adh, preferably adhB) are introduced into the host of the
invention using a bicistronic operon or PET operon as described in U.S.P.N. 5,821,093,
hereby incorporated by reference.

Art recognized techniques for introducing nucleic acids into prokaryotic or
eukaryotic cells such as, e.g., bacteria, yeast, insects, plants, and animals, as well as
appropriate culturing methods, have been described (see, e.g., Large-Scale Mammalian
Cell Culture Technology, Lubiniecki, A., Ed., Marcel Dekker, Pub. (1990); Bacterial
Cell Culture: Essential Data, Ball, A., John Wiley & Sons (1997); Molecular and Cell
Biology of Yeasts, Yarranton et al., Ed., Van Nostrand Reinhold, Pub. (1989); Yeast
Physiology and Biotechnology, Walker, G., John Wiley & Sons, Pub. (1998);
Baculovirus Expression Protocols, Richardson, C., Ed., Humana Press, Pub. (1998);
Methods in Plant Molecular Biology: A Laboratory Course Manual, Maliga, P., Ed.,
C.S.H.L. Press, Pub. (1995); Current Protocols in Molecular Biology, eds. Ausubel et
al., John Wiley & Sons (1992); Sambrook, J., et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY (1989); and Bergey's Manual of Determinative
Bacteriology, Krieg et al., Williams and Wilkins (1984)).

Accordingly, using the methods of the invention, a single genetic construct can
be designed to encode all of the necessary gene products (e.g., a glucanase, an
endoglucanase, an exoglucanase, a secretory protein/s, pyruvate decarboxylase, alcohol
dehydrogenase) for imparting to a non-ethanologenic (or weakly ethanologenic)
organism all the necessary genes for producing ethanol from a sugar substrate.

Alternatively, the invention allows for more than one gene to be introduced
serially without having to change markers. In a preferred embodiment, this allows for
the engineering of a host cell suitable for performing simultaneous saccharification and
fermentation (SSF). In addition, it will also be appreciated that such a host may be
further manipulated, using methods known in the art, to have mutations in any
endogenous gene/s (e.g., recombinase genes) that would interfere with the stability,
expression, and function of the introduced genes. Further, it will also be appreciated that
the invention is intended to encompass any regulatory elements, gene/s, or gene
products, i.e., polypeptides, that are sufficiently homologous to the ones described
herein.

Methods for screening strains having the introduced genes may be performed
using, e.g., PCR amplification using site or gene specific primers, visual screens that can
identify cells expressing the desired integrated gene or marker, or lacking the
marker and any other unnecessary nucleic acid, or other art recognized techniques.

For example, for screening an ethanologenic microorganism, the ADH gene 
product produces acetaldehyde that reacts with the leucosulfonic acid derivative of 
p-roseaniline to produce an intensely red product. Thus, ADH-positive clones can be
easily screened and identified as bleeding red colonies.

Rapid evaluations of ethanol producing potential can also be made by testing the
speed of red spot development on aldehyde indicator plates (Conway et al. (1987) J.
Bacteriol. 169:2591-2597). Typically, strains which prove to be efficient in sugar
conversion to ethanol can be recognized by the production of red spots on aldehyde
indicator plates within minutes of transfer.

In a most preferred embodiment of the invention, the methods of the invention
allow for the production of a single host cell that is ethanologenic, that is, has all the
necessary genes, either naturally occurring or artificially introduced or enhanced (e.g.,
using a surrogate promoter and/or genes from a different species or strain), such that the
host cell has the ability to produce and secrete a polysaccharase/s, degrade a complex
sugar, and ferment the degraded sugar into ethanol. Accordingly, such a host is suitable
for simultaneous saccharification and fermentation. Moreover, this host cell is free of
marker sequences, unnecessary vector sequence, and a heterologous replicon, sequences
that can lead to genomic instability in the organism or make the organism less
suitable for release or use outside, e.g., a controlled laboratory environment.

In addition, it should be readily apparent to one skilled in the art that the ability
conferred by the present invention, to transform genes coding for a protein or an entire
metabolic pathway into a host cell without leaving resident in the host genome
unnecessary or undesired nucleic acid, has wide application. Envisioned in this regard,
for example, is the application of the present invention to a variety of situations where
regulatory elements and/or genes may be added at will and, moreover, deleterious genes
deleted, without unnecessarily littering the genome of the target organism with
unwanted nucleic acid.

IV. Host Cells 

The invention also relates to host cells that are modified using the foregoing
methods and/or nucleic acid constructs of the invention. Preferably, any host cell, e.g.,
animal, plant, or microbe, can be modified according to the methods of the invention.
More preferably, the methods and constructs of the invention allow for the engineering
of heterologous DNA into the genome of any microorganism suitable, e.g., for an
industrial application. Even more preferably, the invention provides for the engineering
of an organism useful for the production of ethanol, e.g., by fermentation.

Accordingly, in one embodiment, a host cell of the invention comprises a 
heterologous polynucleotide segment encoding a polypeptide under the transcriptional 
control of a naturally occurring promoter in the genome or, if appropriate, under the 
control of a heterologous promoter. 

In another embodiment, the host cell prepared according to the method of the
invention is exposed to a recombinase such that the in vivo excision of any unnecessary
nucleic acid flanked by recombining sites is accomplished while the targeted passenger
sequence encoding a desirable gene is left behind. Typically, the nucleic acid sequence
between the recombining sites contains, for example, a marker sequence and may further
contain an origin of replication or replicon. The host cell of the invention is rendered free
of this sequence by being exposed to a site specific recombinase, preferably encoded by
an expressible construct, which is introduced into the cell using standard techniques.

In a preferred embodiment, the resultant recombinant host cell is made
substantially free of any nucleic acid flanked by FRT recombining sites using an FLP
recombinase. It is understood that the host cell of the invention may also be modified
using nucleic acid constructs having recombining sites such as a dif sequence, att
sequence, or loxP sequence and made substantially free of any undesired nucleic acid
using a corresponding recombinase, such as, e.g., Xer, Int, or Cre.

The resultant host cell as prepared according to the method of the invention has
the advantage of no longer having any unnecessary heterologous DNA which can lead
to, e.g., genomic instability. In addition, the host cell no longer encodes a marker
sequence and thus can be retargeted with the same selectable marker sequence in order
to introduce another genetic modification to the cell. Thus, the serial introduction of
multiple genes can be achieved using the same selection marker. In theory at least, there
appears to be no limit as to the number of gene constructs that can be introduced into a
host cell of the invention.

In a most preferred embodiment, the host cell of the invention contains at least
one passenger sequence containing, e.g., a gene of interest, e.g., a gene encoding a
desired polypeptide for use in the bioconversion of a sugar to ethanol, or a step thereof.
Preferred ethanologenic genes include those that encode, e.g., any ethanol pathway gene
that can facilitate the production of ethanol from the cell or extract thereof, such as
alcohol dehydrogenase, pyruvate decarboxylase, a cellulase, or a secretory polypeptide.
A desired ethanologenic gene may be derived from another species, e.g., a yeast, an
insect, an animal, or a plant. The techniques for introducing and expressing such genes
in a recombinant host are presented in the examples.

For example, a non-ethanologenic host cell may be converted into an
ethanologenic host by introducing, for example, ethanologenic genes from an efficient
ethanol producer like Zymomonas mobilis. This type of genetic engineering, using the 
constructs and methods of the invention, results in a recombinant host capable of 
efficiently fermenting sugar into ethanol. 

Accordingly, the invention makes use of a non-ethanologenic recombinant host,
e.g., E. coli strain B or E. coli strain K-12. These strains may be used to express a
desired polypeptide, e.g., an ethanologenic gene introduced using techniques described
herein. In a preferred embodiment of the present invention, the host cell, having been
modified using the methods and constructs of the invention, can be applied in degrading
or depolymerizing a complex saccharide into monosaccharides. Subsequently, the host
cell can catabolize the simpler sugar into ethanol by fermentation. This process of
concurrent complex saccharide depolymerization into smaller sugar residues followed
by fermentation is referred to as simultaneous saccharification and fermentation (SSF).

In another embodiment, the invention makes use of a recombinant host that is
ethanologenic and is improved using the methods of the invention. It is understood that
an improvement of an ethanologenic host may include, e.g., increasing the ability of
the organism to produce ethanol, depolymerize a particular substrate, tolerate a higher
ethanol level, etc. In a preferred embodiment, the recombinant host is a Gram-negative
bacterium. In another embodiment, the recombinant host is from the family
Enterobacteriaceae. The ethanologenic hosts of U.S.P.N. 5,821,093, hereby
incorporated by reference, for example, are suitable hosts and include, in particular, E.
coli strains KO4 (ATCC 55123), KO11 (ATCC 55124), and KO12 (ATCC 55125). In
addition, the LY01 strain may be employed as described in U.S.S.N. 08/834,900, and this
application is hereby incorporated by reference.

It will be appreciated that host cells prepared according to the method of the
invention may be used in conjunction with another recombinant host that expresses yet
another desirable polypeptide, e.g., a different passenger gene sequence. In a particular
example, a non-ethanologenic host cell may be used in conjunction with an
ethanologenic host cell and either one or both of these host cells may be engineered
according to the methods of the invention to accomplish, e.g., one step of a multistep
process, e.g., the converting of a substrate into ethanol. For example, a
non-ethanologenic host may be engineered for carrying out, e.g., the depolymerization of a
complex sugar followed by the use of an engineered ethanologenic host for fermenting
the depolymerized sugar. Accordingly, it will be appreciated that the host cells of the
invention may be used serially or contemporaneously for carrying out a particular
process.

This invention is further illustrated by the following examples which should not
be construed as limiting.

EXEMPLIFICATION

EXAMPLE I

Construction of Targeting Vectors for Introducing Heterologous DNA that is
Excisable In Vivo

In this example, selectable targeting vectors for introducing heterologous DNA
into a recipient host cell are described.
Throughout the example, the following materials and methods were used unless
otherwise stated.

Materials and Methods

The bacterial strains and plasmids used in this example are listed in Table 1,
below.



Table 1. Bacterial strains and plasmids

E. coli STRAINS | RELEVANT GENOTYPE | SOURCE OR REFERENCE
DH5α | lacZΔM15 recA1 endA1 hsdR17 [rK- mK+] supE44 | Promega
S17-1 | thi pro recA hsdR RP4-2-tet::Mu aphA::Tn7 Spr Tpr | de Lorenzo et al., 1990
TOP10F' | F' {lacIq Tn10(TetR)} mcrA Δ(mrr-hsdRMS-mcrBC) Φ80 lacZΔM15 ΔlacX74 recA1 deoR araD139 Δ(ara-leu)7697 galU galK rpsL (StrR) endA1 nupG | Invitrogen
SE2272 | prototrophic strain B, Δfrd102 zjd | K. T. Shanmugam collection
SE2275 | prototrophic strain K-12, Δfrd102 zjd | K. T. Shanmugam collection
FM7 | SE2272 derivative containing pLOI2231 integrated (KmR, CmR) | This study
FM19 | SE2275 derivative containing pLOI2231 integrated (KmR, CmR) | This study
FM18 | FM7 derivative (KmS, CmR) | This study
FM20 | FM19 derivative (KmS, CmR) | This study
LY01 | frd pfl::pdc adhB cat ale* | Yomano et al., J. Ind. Microbiol. Biotech. 20:132-138 (1998)

PLASMIDS | DESCRIPTION | SOURCE OR REFERENCE
pUC18 | colE1 ori; ApR | Messing, J., Methods Enzymol. 101:20-78 (1983)
pCR2.1-TOPO | colE1 ori; ApR; KmR | Invitrogen
LITMUS38 | colE1 ori; M13 ori; ApR | New England Biolabs, Inc.
pSG76A-K-C | γ-R6K ori; FRT; ApR-KmR-CmR | Posfai et al., 1997
pST76A-K-C | pSC101 ori; FRT; ApR-KmR-CmR | Posfai et al., 1997
pFT-A | pSC101 ori; FLP gene under control of the tetracycline repressor; ApR | Posfai et al., 1997
pLOI2216 | XmnI/HindIII fragment substitution from pSG76-A into pUC18; FRT1; colE1 ori; ApR | This study
pLOI2217 | HindIII site deletion of pLOI2216; FRT1; colE1 ori; ApR | This study
pLOI2218 | EcoRI/PstI fragment deletion of pLOI2217; FRT1; colE1 ori; ApR | This study
pLOI2219 | EcoRI site substitution of ClaI site into pLOI2218; FRT1; colE1 ori; ApR | This study
pLOI2220 | ScaI/EcoRI fragment substitution from pLOI2401 into pLOI2219; FRT1; colE1 ori; ApR | This study
pLOI2221 | AscI site substitution of EcoRI site into pLOI2220; FRT1; colE1 ori; ApR | This study
pLOI2222 | HindIII site deletion of pLOI2221; FRT1; colE1 ori; ApR | This study
pLOI2223 | EcoRI/SphI fragment substitution from pLOI2222 into pSG76-A; FRT1; FRT2; γ-R6K ori; ApR | This study
pLOI2224 | EcoRI/SphI fragment substitution from pLOI2222 into pSG76-K; FRT1; FRT2; γ-R6K ori; KmR | This study
pLOI2225 | EcoRI/SphI fragment substitution from pLOI2222 into pSG76-C; FRT1; FRT2; γ-R6K ori; CmR | This study
pLOI2226 | EcoRI/SphI fragment substitution from pLOI2222 into pST76-A; FRT1; FRT2; pSC101 ori; ApR | This study
pLOI2227 | EcoRI/SphI fragment substitution from pLOI2222 into pST76-K; FRT1; FRT2; pSC101 ori; KmR | This study
pLOI2228 | EcoRI/SphI fragment substitution from pLOI2222 into pST76-C; FRT1; FRT2; pSC101 ori; CmR | This study
pLOI2229 | EcoRI fragment from pLOI2408 into [recipient plasmid illegible]; colE1 ori; ApR; KmR | This study
pLOI2230 | BamHI fragment from pLOI510 (Ohta et al., 1991) into pLOI2229; colE1 ori; ApR; KmR; CmR | This study
pLOI2231 | AscI fragment from pLOI2230 into pLOI2224; γ-R6K ori; KmR; CmR | This study
pLOI2400 | AscI site substitution of SphI site into pUC18; colE1 ori; ApR | This study
pLOI2401 | BglII site substitution of KpnI site into pLOI2400; colE1 ori; ApR | This study
pLOI2402 | AscI site substitution of ApaI site into LITMUS38; colE1 ori; M13 ori; ApR | This study
pLOI2403 | AscI site substitution of StuI site into pLOI2402; colE1 ori; M13 ori; ApR | This study
pLOI2408 | adhE gene into pCR2.1-TOPO; colE1 ori; ApR; KmR | This study
pLOI2413 | adhE gene into pLOI2403; colE1 ori; M13 ori; ApR | This study
pLOI510 | promoterless pdc adhB genes from Z. mobilis and cat gene cloned into BamHI site of pUC18; colE1 ori; ApR; CmR | Ohta et al., 1991


For all plasmid constructions, standard methods were employed (Sambrook, J., et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (1989)). Reagents used in cloning were molecular biology
grade and used as directed by the manufacturers. Restriction enzymes, T4 DNA
polymerase, and Klenow polymerase were purchased from New England Biolabs
(Beverly, MA). Taq PCR MasterKit was purchased from Qiagen (Santa Clarita, CA).
Polymerase chain reactions (PCR) were carried out using an Eppendorf Mastercycler
(Westbury, NY). Primers were obtained from Genosys Biotechnologies (The
Woodlands, TX). PCR products were ligated into pCR2.1-TOPO (Invitrogen, Carlsbad,
CA) using topoisomerase. DNA fragments were isolated from gels using a QIAquick
Gel Extraction Kit. DNA fragments were assembled using a Rapid DNA Ligation Kit
(Boehringer Mannheim Corporation, Indianapolis, IN). Wizard Plus kits (Promega,
Madison, WI) were used for plasmid purification. Dideoxy sequencing of plasmids was
performed by using fluorescent primers and a LI-COR model 4000L DNA Sequencer
(Lincoln, NB) as previously described (Lai, X., et al., Appl. Envir. Microbiol. 63:355-363
(1996)). Except where mentioned, all ligation reactions were done with the Rapid
DNA Ligation Kit and used to transform E. coli DH5α, followed by selection on LB
plates containing ampicillin.

Plasmid Constructions

Two sets of insertion vectors were constructed using, as initial vectors, the
suicide plasmids of the pSG76 and pST76 series described by Posfai et al. (1997). These
low copy number plasmids contain the conditional replicons R6K-γ and pSC101,
respectively. Modifications were executed in several steps so that another recombining
site, i.e., an FRT site, was introduced into each of the original vectors, as well as a more
suitable multiple cloning site (MCS). Unless otherwise mentioned in the text, each new
plasmid is shown in brackets immediately after its corresponding description. Complete
sequences for the following plasmids have been deposited in GenBank with the following
accession numbers: pLOI2223, AF172933; pLOI2224, AF172934; pLOI2225, AF172935;
pLOI2226, AF172936; pLOI2227, AF172937; pLOI2228, AF172938 (see also, respectively,
SEQ ID NOS: 1-6).

To facilitate the manipulation of each construct, most of the plasmid
manipulations were done in pUC18 (Messing, J., Methods Enzymol. 101:20-78 (1983))
and derivatives made herein. An 800 bp XmnI-HindIII fragment from pUC18 was
exchanged for a fragment comprising part of the bla gene plus the FRT site from
pSG76-A to create pLOI2216. Digestion, Klenow treatment and self-ligation
[pLOI2217] abolished the HindIII site in pLOI2216. The restriction sites from EcoRI to
PstI were removed by double-digestion with these enzymes, Klenow treatment and
self-ligation [pLOI2218]. Then a new EcoRI site was created by linker addition into the
previously added ClaI site, giving rise to a pUC18-derived plasmid carrying one FRT
site (called FRT1 in Table 1) but no MCS [pLOI2219].

To introduce a new MCS into pLOI2219, the SphI site from pUC18 was replaced
with an AscI site by linker addition [pLOI2400], and the KpnI site was replaced with a
BglII linker [pLOI2401]. The modified pUC18-based MCS was then introduced into
pLOI2219 as a 965 bp ScaI/EcoRI fragment originating from pLOI2401 [pLOI2220]. By
linker addition, an AscI site took the place of the EcoRI site [pLOI2221], and a new
EcoRI site replaced the HindIII site [pLOI2222].

The resultant plasmid was sequenced in both directions and used as a source of
DNA carrying the modified MCS and the FRT1 site.

Construction of pLOI2223

To create the conditional replication vector carrying two FRT sites and the
ampicillin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a
157 bp fragment was ligated to pSG76-A (R6K-γ ori, ApR) digested with the same
enzymes. Transformation was done in E. coli S17-1, since this strain provides the λpir
protein involved in the replication of the R6K-γ ori, and selection was done in the
presence of 50 μg/ml ampicillin. The resultant plasmid was denominated pLOI2223.

Construction of pLOI2224

To create the conditional replication vector carrying two FRT sites and the
kanamycin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a
157 bp fragment was ligated to pSG76-K (R6K-γ ori, KmR) digested with the same
enzymes. Transformation was done in E. coli S17-1, since this strain provides the λpir
protein involved in the replication of the R6K-γ ori, and selection was done in the
presence of 50 μg/ml kanamycin. DNA from several colonies was prepared and
analyzed by digestion with AscI and BglII, separately, to confirm the presence of the
modified MCS. The resultant plasmid was denominated pLOI2224.

Construction of pLOI2225 

To create the conditional replication vector carrying two FRT sites and the
chloramphenicol-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI
and a 157 bp fragment was ligated to pSG76-C (R6K-γ ori, CmR) digested with the same
enzymes. Transformation was performed as described above and selection was done in
the presence of 50 μg/ml chloramphenicol. The resultant plasmid was termed
pLOI2225.

Construction of pLOI2226 
To create the thermosensitive replication vector carrying two FRT sites and the
ampicillin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a
157 bp fragment was ligated to pST76-A (pSC101 ori, ApR) digested with the same
enzymes. For plasmids carrying the thermosensitive replicon (pSC101), E. coli S17-1
transformed cells were kept at 30°C and cells were selected with 40 μg/ml ampicillin.
The resultant plasmid was termed pLOI2226.

Construction of pLOI2227

To create the thermosensitive replication vector carrying two FRT sites and the
kanamycin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a
157 bp fragment was ligated to pST76-K (pSC101 ori, KmR) digested with the same
enzymes. Transformation was performed as described above and selection was done
in the presence of 40 μg/ml of kanamycin. The resultant plasmid was termed pLOI2227.

Construction of pLOI2228 

To create the thermosensitive replication vector carrying two FRT sites and the
chloramphenicol-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI
and a 157 bp fragment was ligated to pST76-C (pSC101 ori, CmR) digested with the
same enzymes. Transformation was performed as described above and selection was
done in the presence of 40 μg/ml of chloramphenicol. The resultant plasmid was termed
pLOI2228.

Construction of pLOI2231

The construct pLOI2231 (SEQ ID NO: 7) is a derivative of pLOI2224 into which
adhE (guide) and the PET (passenger) genes have been cloned. The PET passenger
genes comprise adhB and pdc. To obtain this construct, the adhE gene was first
cloned into the pCR2.1-TOPO vector as directed by the manufacturer. Briefly, the adhE
gene was PCR amplified using a pair of Genosys ORFmer primers:
5'TTGCTCTTCCATGGCTGTTACTAATGTCGCTG...3' forward primer;
5'TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT3' reverse primer. The PCR
product of adhE includes the initiation codon ATG and the termination codon TAA. A
fresh PCR sample of adhE was ligated to the pCR2.1-TOPO vector. This plasmid was
designated pLOI2408. A subcloning step was performed to transfer adhE from
pLOI2408 to pLOI2403. Plasmid pLOI2408 was digested with the restriction enzyme
EcoRI, which releases the entire previously cloned PCR product.
The adhE fragment was separated by gel electrophoresis and isolated from the
gel using the QIAGEN kit.
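
The statement that the amplified adhE product carries its own ATG and TAA codons can be checked directly against the primer sequences. The following is a minimal sketch, not part of the original procedure: the helper names are hypothetical, only the legible 5' portion of the forward primer is used, and the layout assumed here (a shared nine-base tail, one spacer base, then the codon) is inferred from the two sequences as printed.

```python
# Shared 5' tail read off the two printed primers (an observation, not a
# documented specification of the primer design).
TAIL = "TTGCTCTTC"

# Legible 5' portion of the forward primer; its 3' end is unreadable in the source.
FORWARD_LEGIBLE = "TTGCTCTTCCATGGCTGTTACTAATGTCGCTG"
REVERSE = "TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def codon_after_tail(primer: str, tail: str = TAIL, spacer: int = 1) -> str:
    """Return the three bases that follow the shared tail plus a spacer base."""
    if not primer.startswith(tail):
        raise ValueError("primer does not begin with the expected tail")
    start = len(tail) + spacer
    return primer[start:start + 3]


# Codon as it reads on the coding strand of the PCR product.
start_codon = codon_after_tail(FORWARD_LEGIBLE)
stop_codon = reverse_complement(codon_after_tail(REVERSE))

if __name__ == "__main__":
    print("start codon:", start_codon)
    print("stop codon:", stop_codon)
```

The reverse primer anneals to the coding strand, so its codon has to be reverse-complemented before it is compared with the termination codon named in the text.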

The isolated adhE fragment was visualized by UV light after ethidium bromide
staining and gel electrophoresis. This fragment was ligated to plasmid pLOI2403. The
construct pLOI2403 was previously digested with the restriction enzyme EcoRI. The adhE
gene fragment and pLOI2403 were ligated using the Rapid DNA Ligation Kit as directed
by the manufacturer. The plasmid obtained after the ligation between the EcoRI-digested
adhE fragment and EcoRI-digested pLOI2403 was designated pLOI2413. In
this plasmid, the 3' end of the adhE gene is oriented to be close to the BamHI site of
pLOI2403, as revealed by sequence analysis.

In order to ligate the PET genes (passenger) to the adhE gene (guide), a ligation was
performed between a PET fragment and the plasmid pLOI2413. PET genes were
isolated from plasmid pLOI510 as a BamHI fragment of about 4.4 kbp. The PET
fragment was isolated with the QIAGEN kit from a gel after electrophoresis of
BamHI-digested plasmid pLOI510. The PET DNA fragment was visualized by
ethidium bromide after gel electrophoresis. A ligation of the isolated PET fragment with
plasmid pLOI2413 previously digested with the restriction enzyme BamHI was performed.
The plasmid obtained was designated pLOI2230.

To obtain an integration vector with the guide and passenger fragments as shown
in Fig. 3, the plasmid pLOI2224 was ligated to a fragment containing the adhE and PET
genes isolated from plasmid pLOI2230. Specifically, pLOI2230 was
digested with the restriction enzyme AscI and the digestion product was cloned into
pLOI2224 previously digested with AscI. The plasmid obtained from this
ligation was designated pLOI2231 and the orientation of the various genetic elements of
the plasmid is shown in Fig. 3.

Targeting Strategy

Recombinase-based integration systems offer the opportunity to effect precise
DNA deletions in vivo (Cherepanov, P. P., et al., Gene 158: 9-14 (1995); Posfai, G., et al.,
Nucleic Acids Res. 22: 2392-2398 (1994)) and in vitro (Cox, M. M., Proc. Natl. Acad.
Sci. USA 80: 4223-4227 (1983); Hasan, N., et al., Gene 150: 51-56 (1994); Wild, J., Z.,
et al., Gene 179: 181-188 (1996)). The present invention encompasses plasmids as
described by Posfai et al. (J. Bacteriol. 179: 4426-4428 (1997)) that were further
modified to incorporate homologous DNA as a guide sequence, two FRT sites in the
same orientation flanking an antibiotic marker, and a multiple cloning site (MCS) (Fig. 1).

A full set of plasmids (ampicillin, chloramphenicol, and kanamycin resistance)
was made for each conditional replicon (Fig. 1; pSC101 and R6K-γ). Since both
conditional replicons are present at low copy numbers, an additional high copy vector
was developed to facilitate constructions by adding an AscI site on either side of the
MCS in pLITMUS38 (New England Biolabs, Beverly, MA) to produce pLOI2403 (Fig. 1).

A general procedure for chromosomal integration of DNA using the new vectors
with conditional replicons is presented in Fig. 2. The integration of heterologous
passenger DNA carrying desired functions can be targeted to any specific chromosomal
site by an adjacent fragment of homologous DNA (guide) during growth at 30°C
(pSC101), or directly selected at 37-42°C (pSC101 or R6K-γ). With a single cross-over
event, the entire plasmid is incorporated into the chromosome. (If needed, pSC101-based
integration vectors can be eliminated by overnight growth and plating at elevated
temperatures.)

After integration, recombinants were transformed with pFT-A containing the
yeast FLP gene under control of the tetracycline promoter and grown under permissive
conditions (30°C, pSC101). During growth with chlortetracycline, FLP recombinase
was induced, which in turn resulted in the excision of the DNA bracketed by the
concurrently-facing FRT sites (selectable marker and replicon) from the chromosome as
shown in Fig. 3. After growth at 37-42°C to eliminate plasmid pFT-A encoding the
recombinase, only the passenger gene(s), a single FRT, and the homologous guide
fragment remained in the chromosome as described in further detail below.
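
The excision step described above can be illustrated with a short Python sketch. It is a toy model only: the integrated locus is represented as an ordered list of labels, the order of elements is chosen for illustration and is not taken from Fig. 3, and the function name flp_excise is hypothetical. Two FRT sites in the same orientation are assumed, so that recombination deletes the intervening DNA and leaves one FRT behind.

```python
def flp_excise(elements, site="FRT"):
    """Delete everything between the first two `site` labels, keeping one site.

    Models recombination between two directly repeated FRT sites.
    If fewer than two sites are present, the locus is returned unchanged.
    """
    positions = [i for i, label in enumerate(elements) if label == site]
    if len(positions) < 2:
        return list(elements)
    first, second = positions[0], positions[1]
    # Keep everything up to and including the first site, then skip to
    # whatever follows the second site.
    return list(elements[:first + 1]) + list(elements[second + 1:])


# Illustrative element order (an assumption, not the actual map).
integrated = ["guide", "passenger", "FRT", "marker", "replicon", "FRT"]
print(flp_excise(integrated))
```

In this model the marker and replicon are lost while the guide, the passenger, and a single FRT remain, which matches the outcome stated in the text.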
EXAMPLE II

A Method for Engineering an Ethanologenic Host Cell by Chromosomal 
Integration of Heterologous DNA 

In this example, methods for introducing heterologous DNA to engineer an 
ethanologenic organism are described. 

To illustrate the utility of the foregoing vectors, heterologous genes were
targeted into the genomes of two different test microorganisms in order to generate a
genetically engineered organism with a desired phenotype (i.e., ability to produce
ethanol). In particular, derivatives of E. coli B (strain SE2272, Δfrd) and E. coli K-12
(SE2275, Δfrd) were constructed in which three heterologous genes were integrated
immediately behind the endogenous adhE gene in the chromosome. The guide and
passenger DNA were cloned into pLOI2403, a high copy plasmid vector described
above. For this construction, the promoterless adhE coding region (guide) was
amplified with Genosys ORFmer primers for the coding region (forward,
5'TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA3'; reverse,
5'TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT3') and cloned into pCR2.1-TOPO
to produce pLOI2408 (see, for example, Fig. 3). After EcoRI digestion, the 2.6 Kbp adhE
region from pLOI2408 was moved into the corresponding site in pLOI2403 to produce
pLOI2413. The BamHI site immediately downstream from the 3' end of the adhE
coding region was used to insert a 4.6 Kbp BamHI fragment from pLOI510 (Ohta, K., et
al., Appl. Environ. Microbiol. 57: 893-900 (1991)) containing three genes (passenger): a
promoterless Zymomonas mobilis pdc without a transcriptional terminator and a
promoterless Z. mobilis adhB with a transcriptional terminator, followed by a complete
cat operon with promoter and terminator. In the resulting plasmid (pLOI2230),
transcription of the heterologous genes was oriented concurrently with adhE. All
constructs containing the Z. mobilis genes were grown in LB supplemented with glucose
(20 g/L for plates; 50 g/L for broth).

The 7.2 Kbp AscI fragment from pLOI2230 (high copy vector) containing adhE,
the artificial operon pdc adhB, and cat was ligated into the low copy integration vector
pLOI2224, which contains an R6K-γ replicon (pir-dependent), and transformed into the
permissive host S17-1 (de Lorenzo, V., et al., J. Bacteriol. 172: 6568-6572 (1990)) with
selection for kanamycin and chloramphenicol. The resulting clone containing pLOI2231
was used for large-scale plasmid isolation (500 ml) by the alkaline lysis procedure
(Sambrook, J., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY (1989)).
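
As a consistency check on the fragment sizes quoted in this example, the following Python sketch adds the 2.6 Kbp adhE guide region and the 4.6 Kbp BamHI passenger fragment and compares the total with the 7.2 Kbp AscI fragment. It assumes the short polylinker sequences between the sites are negligible at this precision; the function name insert_size_kbp is hypothetical.

```python
def insert_size_kbp(parts_kbp):
    """Sum fragment sizes (Kbp), rounded to the one decimal place used in the text."""
    return round(sum(parts_kbp), 1)


ADHE_GUIDE_KBP = 2.6      # EcoRI adhE region moved from pLOI2408
PET_PASSENGER_KBP = 4.6   # BamHI fragment from pLOI510 (pdc, adhB, cat)
ASCI_FRAGMENT_KBP = 7.2   # AscI fragment excised from pLOI2230

total = insert_size_kbp([ADHE_GUIDE_KBP, PET_PASSENGER_KBP])
print(total, total == ASCI_FRAGMENT_KBP)
```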

Approximately 500 ng of pLOI2231 DNA was used for electroporation of
SE2272 and SE2275. Both are non-permissive hosts. Recombinants were readily
obtained by selection for either kanamycin (vector) or chloramphenicol (passenger)
resistance. Up to 2 h was allowed for expression of the resistance gene prior to
spreading on plates for selection. Approximately 1000 recombinants per 1 µg DNA
(electroporation) were recovered using E. coli K-12 SE2275, 5-fold higher than that
obtained with E. coli B SE2272. Thirty recombinants from each host were screened
for the functional expression of alcohol dehydrogenase on indicator plates (Conway,
T., et al., J. Bacteriol. 169: 2591-2597 (1987)). Based on the rate and intensity of
color development, these recombinants expressed higher levels of ADH activity than the
respective unmodified SE2272 or SE2275, or S17-1(pLOI2231) harboring promoterless
pdc and adhB genes. Unlike the control strains, these recombinants also exhibited a
colonial phenotype (large raised colonies on LB containing glucose) that is typical for
ethanologenic E. coli (Ingram, L.O., et al., Appl. Environ. Microbiol. 53: 2420-2425 (1987)).
Small-scale DNA preparations (7 recombinants per host) were tested for the presence of
pLOI2231. None contained plasmids as tested by gel filtration or based on
transformation experiments using S17-1 as the host. These recombinants were
presumed to contain chromosomally integrated genes. One clone from each parent,
strains FM7 (E. coli B SE2272) and FM19 (E. coli K-12 SE2275), was selected for
further study.
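
The recovery figures above can be turned into rough colony estimates with the following Python sketch. It assumes that the number of recombinants scales linearly with the mass of DNA electroporated, which the text does not state, and the function name expected_recombinants is hypothetical; the results are estimates, not reported data.

```python
def expected_recombinants(per_ug, dna_ng):
    """Estimate recombinants for a given DNA mass, assuming linear scaling."""
    return per_ug * dna_ng / 1000.0


K12_PER_UG = 1000.0            # approximate value reported for SE2275
B_PER_UG = K12_PER_UG / 5.0    # SE2272 gave about 5-fold fewer
DNA_NG = 500.0                 # DNA used per electroporation

print(expected_recombinants(K12_PER_UG, DNA_NG))
print(expected_recombinants(B_PER_UG, DNA_NG))
```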

Accordingly, strains FM7 and FM19 were transformed with the helper plasmid
(pFT-A) carrying the FLP gene (Posfai, G., et al., J. Bacteriol. 179: 4426-4428 (1997))
and incubated at 30°C with selection for ampicillin resistance. A mixture of colonies
was used to inoculate a broth culture for induction of FLP with autoclaved
chlortetracycline (20 ng/ml). After 6 h incubation at 30°C, the culture was diluted
1:1000 in LB containing glucose and incubated at 42°C for 16 h to eliminate the helper
plasmid. After streaking on solid medium, isolated colonies were screened for the
absence of antibiotic markers. Approximately 80% of the colonies were ampicillin and
kanamycin sensitive and retained only chloramphenicol resistance and the ethanologenic
traits (passenger genes inserted into the MCS). Loss of ampicillin resistance indicated
that the helper plasmid had been successfully eliminated, while loss of kanamycin
resistance confirmed the FLP-recombinase-dependent deletion of the vector. These new
derivatives of FM7 and FM19 were designated FM18 and FM20, respectively.
PCR was used to verify the integration events in both FM18 and FM20. Two
new sets of primers were designed to amplify the adhE gene including the unique
junctions predicted for pdc (Fig. 3, primers 3 and 4) and cat (Fig. 3, primers 5 and 6) as
a result of integration and FLP-mediated deletion. Forward primer 3 hybridizes to the
promoter region of adhE while reverse primer 4 hybridizes to the N-terminal coding
region of pdc. Forward primer 5 hybridizes to the C-terminal coding region of the cat
gene and reverse primer 6 hybridizes to the 3' untranslated portion of adhE. Note that
the primers used to clone adhE, forward primer 1 and reverse primer 2, hybridize to the
N-terminal and C-terminal coding regions of adhE and lie inside the region bounded by
forward primer 3 and reverse primer 6. All primer sets (SE2272 template: 1+2, 3+6;
FM18 template: 3+4, 5+6) generated products of the expected sizes (Fig. 4A). Identical
results were also obtained using FM20 DNA as the template. In Fig. 3, the
arrows numbered 1, 2, 3, 4, 5, and 6 represent the primers used to amplify the
corresponding regions. The sequences of these primers are as follows:

Primer 1. 5'TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA3'
Primer 2. 5'TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT3'
Primer 3. 5'GTGAGTGTGAGCGCGGAGT3'
Primer 4. 5'TGGCACGAGCATAACCTTC3'
Primer 5. 5'CAGTACTGCGATGAGTGGCA3'
Primer 6. 5'GTTGCCAGACAGCGCTACT3'
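
For readers who wish to inspect the six primers listed above, the following Python sketch reports the length, GC fraction, and reverse complement of each. The sequences are copied from this example (primer 1 as given for the forward ORFmer primer earlier in the example); the helper names gc_fraction and reverse_complement are hypothetical, and no melting temperatures are estimated because the specification does not give a formula for them.

```python
PRIMERS = {
    1: "TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA",
    2: "TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT",
    3: "GTGAGTGTGAGCGCGGAGT",
    4: "TGGCACGAGCATAACCTTC",
    5: "CAGTACTGCGATGAGTGGCA",
    6: "GTTGCCAGACAGCGCTACT",
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in an upper-case DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


for number, seq in PRIMERS.items():
    print(f"Primer {number}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}, "
          f"reverse complement = {reverse_complement(seq)}")
```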

The adhE gene contains a single central BstEII site which does not occur
elsewhere in the PCR products. This site was used to verify the identity of the PCR
fragments. As shown in Fig. 4B, all PCR products were cut once to produce fragments
containing the N-terminal and C-terminal regions of adhE. Fragments from the adhE
coding region alone (primers 1+2) were smaller (N-terminal fragment = 1,226 bp;
C-terminal fragment = 1,470 bp) than fragments which included parts of the native adhE
promoter (primers 3+4 and 3+6; N-terminal fragment = 1,325 bp) or adhE terminator
(primers 3+6, 5+6; C-terminal fragment = 1,489 bp). The fragment which included part
of pdc (primers 3+4) was the largest C-terminal fragment (1,783 bp). The fragment
which included part of cat (primers 5+6) was the largest N-terminal fragment (1,988 bp).
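
Because each PCR product is cut once by BstEII, the two fragment sizes for each primer pair should add up to the full-length product reported for Fig. 4A (2,696, 2,814, 3,108, and 3,477 bp). The Python sketch below performs that check using only numbers stated in this example; the dictionary names are hypothetical.

```python
# (N-terminal fragment, C-terminal fragment) in bp, per primer pair.
BSTEII_FRAGMENTS = {
    "1+2": (1226, 1470),
    "3+6": (1325, 1489),
    "3+4": (1325, 1783),
    "5+6": (1988, 1489),
}

# Full-length PCR product sizes in bp, as given for Fig. 4A.
FULL_LENGTH = {"1+2": 2696, "3+6": 2814, "3+4": 3108, "5+6": 3477}

for pair, (n_term, c_term) in BSTEII_FRAGMENTS.items():
    total = n_term + c_term
    status = "consistent" if total == FULL_LENGTH[pair] else "MISMATCH"
    print(f"primers {pair}: {n_term} + {c_term} = {total} bp ({status})")
```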

In Fig. 4, panel A shows full-length PCR products of adhE junctions. Positions
and sequences of primers are provided in Fig. 3. The lanes in panel A of Fig. 4
represent: 1, HindIII digest of phage λ DNA; 2, adhE coding region (2,696 bp) of
SE2272 amplified using forward primer 1 and reverse primer 2; 3, adhE promoter and
3' untranslated sequence (2,814 bp) of SE2272 amplified using forward primer 3 and
reverse primer 6; 4, the adhE and pdc junction (3,108 bp) of FM18 amplified using
forward primer 3 and reverse primer 4; 5, the cat junction and adhE (3,477 bp) of
FM18 amplified using forward primer 5 and reverse primer 6. The amplification
conditions for primer pairs 1+2 and 3+6 (25 cycles) were: 45 sec at 94°C, 45 sec at
60°C, 60 sec at 72°C. The amplification conditions for primer pairs 3+4 and 5+6 (30
cycles) were: 45 sec at 94°C, 60 sec at 60°C. DNA was held at 94°C for 3 min prior to
the first cycle. The elongation time was increased to 10 min during the final cycle.

In Fig. 4, panel B shows a BstEII digestion of the above-described PCR products.
A single, central BstEII site was used to cleave PCR products containing adhE into
N-terminal and C-terminal fragments. The lanes in panel B of Fig. 4 show the following:
Lane 1 contains a DNA standard (descending: 3.0, 2.0, 1.5, 1.2, 1.0 Kbp); Lanes 2
through 5 contain BstEII-digested PCR products of the fragments described in 4A,
respectively.

The expression of adhE is regulated by a number of factors in E. coli including
cra, adhR, and the abundance of NADH (Leonardo, M.R., et al., J. Bacteriol. 175: 870-878
(1993); Leonardo, M.R., et al., J. Bacteriol. 178: 6013-6018 (1996); Mikulskis, A.,
et al., J. Bacteriol. 179: 7129-7134 (1997)). Both message levels and activity are
approximately 10-fold higher during anaerobic growth with glucose than during aerobic
growth. Since the Z. mobilis genes are integrated behind the adhE coding region to form
an operon fusion, expression of pdc should also increase in response to anaerobiosis.
Strains FM18 and FM20 were grown in Luria broth containing 50 g glucose/liter under
aerobic and anaerobic conditions. PDC activities were determined in heat-treated
preparations to eliminate lactate dehydrogenase and other confounding activities
(Conway, T., et al., J. Bacteriol. 169: 949-954 (1987)). Under anaerobic conditions,
PDC activities in FM18 and FM20 were 0.254 U/mg protein and 0.185 U/mg protein,
respectively, approximately 4-fold higher than the activities observed in cells grown
under aerobic conditions.
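
The aerobic PDC activities are not reported directly, but rough values can be back-calculated from the anaerobic activities and the approximately 4-fold difference. The Python sketch below does this; the outputs are estimates derived under the assumption of an exactly 4-fold ratio, not measured values, and the function name implied_aerobic_activity is hypothetical.

```python
def implied_aerobic_activity(anaerobic_u_per_mg, fold_increase):
    """Estimate aerobic activity from the anaerobic value and a fold increase."""
    return anaerobic_u_per_mg / fold_increase


ANAEROBIC_PDC = {"FM18": 0.254, "FM20": 0.185}  # U/mg protein, from the text
FOLD_INCREASE = 4.0                              # "approximately 4-fold"

for strain, activity in ANAEROBIC_PDC.items():
    estimate = implied_aerobic_activity(activity, FOLD_INCREASE)
    print(f"{strain}: about {estimate:.3f} U/mg protein aerobically (estimate)")
```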

The above results demonstrate that the new integration vectors can be used to
place promoterless genes under the control of a chromosomal promoter in a site-specific
fashion. Moreover, after the integration event, unnecessary nucleic acid encoding a
replicon and selectable marker was removed in vivo using a recombining site-specific
recombinase. This approach avoids potential problems of lethality or mutation due to
unregulated expression in plasmids during construction and integration. These vectors
can also be used to replace promoters in chromosomal genes. Additional unique
restriction sites are available for the insertion of genes which can be temporarily
expressed after integration and subsequently deleted with the replicon and selectable
marker. This option, the temporary introduction of new genes, may be useful to test new
traits in an isogenic background.

Although the vectors described must be propagated in E. coli, they are
potentially useful with other organisms. The FLP recombinase is extremely efficient
(Wild, J., Z., et al., Gene 179: 181-188 (1996)) and could be produced intracellularly as a
transient expression product after transformation or electroporation of pFT-A.

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be encompassed by the following
claims. Moreover, any number of genetic constructs, host cells, and methods described
in United States Patent Nos. 5,821,093; 5,482,846; 5,424,202; 5,028,539; 5,000,000;
5,162,516; and U.S. patent application serial No. 60/136,376 may be employed in
carrying out the present invention and are hereby incorporated by reference.

Claims

1. A method for integrating a nucleic acid construct into the genome of a host cell,
said method comprising,

contacting said cell with a nucleic acid construct under conditions such that said
nucleic acid construct is integrated by said cell, wherein said nucleic acid construct
comprises a passenger sequence and a marker sequence, wherein said marker sequence
is flanked by a first and second recombining site wherein said recombining sites are
oriented in the same direction.

2. The method of claim 1, wherein said construct further comprises an origin of
replication between said first and second recombining sites.

3. The method of claim 2, wherein said origin of replication is, or is derived from,
an origin of replication selected from the group consisting of pSC101 ori, R6K-γ ori,
colE1, and oriEV.

4. The method of claim 1, wherein said nucleic acid further comprises a sequence
selected from the group consisting of a promoter, a restriction site, an intron, an IRES
element, and a polyadenylation site.

5. The method of claim 1, wherein said nucleic acid further comprises a guide
sequence thereby resulting in site-specific integration of said nucleic acid construct.

6. The method of claim 1, wherein said guide sequence comprises genomic
sequence derived from a replicating genome.

7. The method of claim 1, wherein said cell is provided with a site-specific
recombinase that results in recombination between said first and second recombining
sites, wherein said sequence flanked by said recombining sites is excised.

8. The method of claim 7, wherein said recombinase is selected from the group
consisting of FLP, Xer, Int, and Cre. 

9. The method of claim 7, wherein said recombinase is FLP recombinase. 

10. The method of claim 1, wherein said recombining site is selected from the group
consisting of FRT, dif, att, and loxP.

11. The method of claim 1, wherein said recombining site is FRT.

12. The method of claim 7, wherein following excision of said sequence, the method
of claim 1 is repeated on said cell.

13. The method of claim 1, wherein said construct further comprises a promoter 5' to
the passenger sequence.

14. The method of claim 1, wherein said marker sequence is selected from the group
consisting of an antibiotic resistance gene and a non-antibiotic resistance gene.

15. The method of claim 14, wherein said antibiotic resistance gene is selected from
the group consisting of the gentamycin resistance gene, zeocin resistance gene,
kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, and
chloramphenicol resistance gene.

16. The method of claim 11, wherein said non-antibiotic resistance gene is selected
from the group consisting of an auxotrophic gene, a metal ion resistance gene, a trp
gene, and a gene encoding a green fluorescent protein.

17. The method of claim 1, wherein said passenger sequence encodes at least one
gene.

18. The method of claim 17, wherein said passenger sequence encodes a gene
involved in ethanologenesis. 

19. The method of claim 18, wherein said gene involved in ethanologenesis is
derived from an organism selected from the group consisting of prokaryotes and
eukaryotes.

20. The method of claim 18, wherein said gene is selected from the group consisting
of adh and pdc.

21. The method of claim 1, wherein said host cell is a bacterial cell.

22. The method of claim 1, wherein said host cell is a Gram-negative bacterial cell. 

23. The method of claim 22, wherein said host cell is a facultatively anaerobic
bacterial cell. 

24. The method of claim 23, wherein said host cell is selected from the family 
Enterobacteriaceae. 

25. The method of claim 24, wherein said host is selected from the group consisting
of Escherichia and Klebsiella.

26. The method of claim 24, wherein said host is E. coli B or E. coli K12.

27. The method of claim 1, wherein said host cell is a recombinant bacterial cell.

28. The method of claim 1, wherein said nucleic acid construct is, or is derived from,
a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225,
pLOI2226, pLOI2227, pLOI2228, pLOI2231, and pLOI2403.

29. A method for producing a recombinant ethanologenic cell, said method
comprising,

contacting said cell with a nucleic acid construct under conditions in which
integration of said nucleic acid construct occurs forming a recombinant ethanologenic
cell, wherein said nucleic acid construct comprises a passenger sequence, wherein said
passenger sequence comprises an ethanologenic gene, and a marker sequence, wherein
said marker sequence is flanked by a first and second recombining site.

30. The method of claim 29, wherein said construct further comprises an origin of
replication between said first and second recombining site.

31. The method of claim 30, wherein said origin of replication is, or is derived from,
an origin of replication selected from the group consisting of pSC101 ori, R6K-γ ori,
colE1 ori, and oriEV.

32. The method of claim 29, wherein said cell is provided with a site-specific
recombinase that results in recombination between said first and second recombining
sites wherein said sequence flanked by said recombining sites is excised.

33. The method of claim 29, wherein said recombinase is selected from the group
consisting of FLP, Xer, Int, and Cre.

34. The method of claim 33, wherein said recombinase is FLP recombinase.

35. The method of claim 29, wherein said recombining site is selected from the
group consisting of FRT, dif, att, and loxP.

36. The method of claim 35, wherein said recombining site is FRT.

37. The method of claim 29, wherein said ethanologenic gene is selected from the
group consisting of adh and pdc.

38. The method of claim 29 where said passenger sequence encodes pdc, adhB, and
cat. 

39. The method of claim 29, wherein said nucleic acid further comprises a guide
sequence thereby resulting in site-specific integration of said nucleic acid construct.

40. The method of claim 39, wherein said guide sequence is derived from a
replicating genome.

41. The method of claim 29, wherein said marker sequence is selected from the
group consisting of an antibiotic resistance gene or a non-antibiotic resistance gene.

42. The method of claim 41, wherein said antibiotic resistance gene is selected from
the group consisting of gentamycin resistance gene, zeocin resistance gene, kanamycin
resistance gene, ampicillin resistance gene, tetracycline resistance gene, and
chloramphenicol resistance gene.

43. The method of claim 41, wherein said non-antibiotic resistance gene is selected
from the group consisting of an auxotrophic gene, a metal ion resistance gene, a trp
gene, and a gene encoding a green fluorescent protein.

44. The method of claim 41, where said marker sequence comprises the kanamycin
resistance gene.

45. A recombinant ethanologenic cell produced according to the method of claim 29.

46. A recombinant host cell comprising,

a nucleic acid construct that comprises a passenger sequence and a
marker sequence, wherein said marker sequence is flanked by a first and second
recombining site wherein said recombining sites are oriented in the same direction.

47. The host cell according to 46, wherein said construct further comprises an origin
of replication between said first and second recombining sites.

48. The host cell according to 46, wherein said origin of replication is, or is derived
from, an origin of replication selected from the group consisting of pSC101 ori,
R6K-γ ori, colE1, and oriEV.

49. The host cell according to claim 46, wherein said nucleic acid construct is
integrated.

50. The host cell according to claim 46, wherein said nucleic acid further comprises
a guide sequence thereby resulting in site-specific integration of said nucleic acid
construct.

51. The host cell according to claim 46, wherein said guide sequence comprises
genomic sequence derived from a replicating genome.

52. The host cell according to claim 46, wherein said cell is provided with a
site-specific recombinase that results in recombination between said first and second
recombining sites, wherein said sequence flanked by said recombining sites is excised.

53. The host cell according to claim 46, wherein said recombinase is selected from
the group consisting of FLP, Xer, Int, and Cre.

54. The host cell according to claim 46, wherein said recombinase is FLP
recombinase.

55. The host cell according to claim 46, wherein said recombining site is selected
from the group consisting of FRT, dif, att, and loxP.

56. The host cell according to claim 46, wherein said recombining site is FRT.

57. The host cell according to claim 46, wherein said marker sequence is selected
from the group consisting of an antibiotic resistance gene or a non-antibiotic resistance
gene.

58. The host cell according to claim 57, wherein said antibiotic resistance gene is
selected from the group consisting of a gentamycin resistance gene, zeocin resistance
gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance
gene, and chloramphenicol resistance gene.

59. The host cell according to claim 57, wherein said non-antibiotic resistance gene
is selected from the group consisting of an auxotrophic gene, a metal ion resistance
gene, a trp gene, and a gene encoding a green fluorescent protein.

60. The host cell according to claim 46, wherein said passenger sequence encodes a
gene involved in ethanologenesis.

61. The host cell according to claim 46, wherein said passenger sequence is selected
from the group comprising adh, pdc, and cat.

62. The host cell according to claim 46, wherein said host cell is a bacterial cell.

63. The host cell according to claim 46, wherein said host cell is a Gram-negative
bacterial cell.

64. The host cell according to claim 46, wherein said host cell is a facultatively
anaerobic bacterial cell.

65. The host cell according to claim 46, wherein said host cell is selected from the
family Enterobacteriaceae.

66. The host cell according to claim 46, wherein said host is selected from the group
consisting of Escherichia and Klebsiella.

67. The host cell according to claim 46, wherein said host is E. coli B or E. coli K12.

68. The host cell according to claim 46, wherein said host cell is a recombinant
bacterial cell.

69. The host cell according to claim 46, wherein said nucleic acid construct is, or is
derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224,
pLOI2225, pLOI2226, pLOI2227, pLOI2228, pLOI2231, and pLOI2403.

70. The host cell according to claim 46, wherein said cell is ethanologenic.

71. A method for producing ethanol comprising,

providing a recombinant ethanologenic cell having a nucleic acid
construct that comprises a passenger sequence and a marker sequence, wherein said
marker sequence is flanked by a first and second recombining site, and

contacting said cell with a substrate which can be fermented into
ethanol, wherein expression of said passenger sequence results in the production of
ethanol.

72. The method of claim 71, wherein said construct further comprises an origin of
replication between said first and second recombining sites.

73. The method of claim 72, wherein said origin of replication is, or is derived from,
an origin of replication selected from the group consisting of pSC101 ori, R6K-γ ori,
colE1, and oriEV.

74. The method according to claim 71, wherein said nucleic acid construct is
integrated.

75. The method according to claim 71, wherein said nucleic acid construct further
comprises a guide sequence thereby resulting in site-specific integration of said nucleic
acid construct.

76. The method according to claim 75, wherein said guide sequence comprises
genomic sequence derived from a replicating genome.

77. The method according to claim 71, wherein said cell is provided with a
site-specific recombinase that results in recombination between said first and second
recombining site, wherein said sequence flanked by said recombining sites is excised.

78. The method according to claim 71, wherein said recombinase is selected from
the group consisting of FLP, Xer, Int, and Cre.

79. The method according to claim 71, wherein said recombinase is FLP
recombinase.

80. The method according to claim 71, wherein said recombining site is selected
from the group consisting of FRT, dif, att, and loxP.

81. The method according to claim 71, wherein said recombining site is FRT.

82. The method according to claim 71, wherein said marker sequence is selected
from the group consisting of an antibiotic resistance gene or a non-antibiotic resistance
gene.

83. The method of claim 82, wherein said antibiotic marker is selected from the
group consisting of a gentamycin resistance gene, zeocin resistance gene, kanamycin
resistance gene, ampicillin resistance gene, tetracycline resistance gene, and
chloramphenicol resistance gene.

84. The method of claim 82, wherein said non-antibiotic resistance gene is selected
from the group consisting of an auxotrophic gene, a metal ion resistance gene, a trp
gene, and a gene encoding a green fluorescent protein.

85. The method according to claim 71, wherein said passenger sequence encodes a
gene involved in ethanologenesis.

86. The method according to claim 71, wherein said passenger sequence is selected
from the group comprising adh and pdc.

87. The method according to claim 71, wherein said passenger sequence encodes
adhB, pdc, and cat.

88. The method according to claim 71, wherein said host cell is a bacterial cell.

89. The method according to claim 71, wherein said host cell is a Gram-negative
bacterial cell.

90. The method according to claim 71, wherein said host cell is a facultatively
anaerobic bacterial cell.

91. The method according to claim 71, wherein said host cell is selected from the
family Enterobacteriaceae.

92. The method according to claim 71, wherein said host is selected from the group
consisting of Escherichia and Klebsiella.

93. The method according to claim 71, wherein said host is E. coli B or E. coli K12.

94. The method according to claim 71, wherein said host cell is a recombinant
bacterial cell.

95. The method according to claim 71, wherein said nucleic acid construct is, or is
derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224,
pLOI2225, pLOI2226, pLOI2227, pLOI2228, pLOI2231, and pLOI2403.

96. A nucleic acid construct comprising,

a passenger sequence, and

a marker sequence, wherein said marker sequence is flanked by a first
and second recombining site.

97. The construct according to claim 96, wherein said construct further comprises an
origin of replication between said first and second recombining sites.

98. The construct according to claim 97, wherein said origin of replication is, or is
derived from, an origin of replication selected from the group consisting of pSC101 ori,
R6K-γ ori, colE1, and oriEV.

99. The construct according to claim 84, wherein said construct further comprises a
guide sequence.

100. The construct according to claim 99, wherein said guide sequence comprises a
sequence derived from a replicating genome.

101. The construct according to claim 99, wherein said guide sequence is derived
from a bacterial cell.

102. The construct according to claim 96, wherein said construct further comprises
at least one unique restriction enzyme site.

103. The construct according to claim 96, wherein said passenger sequence comprises
an ethanologenic gene.

104. The construct according to claim 96, wherein said passenger sequence comprises
a gene selected from the group consisting of adh and pdc.

105. The construct according to claim 96, wherein said passenger sequence comprises
adhB, pdc, and cat.
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106. The construct according to claim 96, wherein said recombining site is selected 
from the group consisting of FRT, dif, att, and loxP. 

107. The construct according to claim 96, wherein said recombining site is FRT. 


108. The construct according to claim 96, wherein said construct is exposed to a site- 
specific recombinase such that a recombination between said first and second 
recombining site occurs, wherein said sequence flanked by said recombining sites is 
excised. 


109. The construct according to claim 108, wherein said recombinase is selected from 
the group consisting of FLP, Xer, Int, and Cre. 

110. The construct according to claim 109, wherein said recombinase is FLP. 


111. The construct according to claim 96, wherein said passenger sequence comprises 
a sequence selected from the group consisting of a heterologous promoter and a 
prokaryotic termination sequence. 



112. The construct according to claim 96, wherein said marker sequence is selected 
from the group consisting of an antibiotic resistance gene or a non-antibiotic resistance 
gene. 

113. The construct according to claim 112, wherein said antibiotic resistance gene is 
selected from the group consisting of a gentamycin resistance gene, zeocin resistance 
gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance 
gene, and chloramphenicol resistance gene. 

114. The construct according to claim 112, wherein said non-antibiotic resistance 
gene is selected from the group consisting of an auxotrophic resistance gene, a metal ion 
resistance gene, a trp gene, and a gene encoding a green fluorescent protein. 

115. The construct according to claim 96, wherein said marker sequence comprises 
the kanamycin resistance gene. 

116. The construct according to claim 96, wherein said construct is, or is derived 
from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, 
pLOI2226, pLOI2227, pLOI2228, pLOI2231, and pLOI2403. 

117. A kit comprising at least one nucleic acid construct according to claim 96 and 
instructions for use. 
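
As an informal illustration only (it forms no part of the claims), the excision recited in claims 96 and 106 to 110 can be sketched in Python. The sketch treats a construct as a linear string and removes the segment lying between two directly repeated recombining sites, leaving a single site behind. The function name `excise_between_sites` and the linear-string model are assumptions made for illustration; the 34-base site used as the default is the sequence that occurs twice, in the same orientation, in SEQ ID NO: 1 of the sequence listing, and it matches the commonly cited minimal FRT site.

```python
from typing import Tuple

# 34-base site found twice (same orientation) in SEQ ID NO: 1 as printed;
# it matches the commonly cited minimal FRT site.
FRT = "gaagttcctattctctagaaagtataggaacttc"


def excise_between_sites(seq: str, site: str = FRT) -> Tuple[str, str]:
    """Model excision between two direct repeats of `site`.

    Returns (remaining, excised). `remaining` keeps one copy of the site;
    `excised` is the intervening segment plus the other copy, i.e. the
    content of the circle that would be released. This is a simplified
    string model, not a description of the recombinase mechanism.
    """
    seq = "".join(seq.split()).lower()  # tolerate spaced 10-base groups
    first = seq.find(site)
    if first == -1:
        raise ValueError("no recombining site found")
    second = seq.find(site, first + len(site))
    if second == -1:
        raise ValueError("a second recombining site is required")
    end_first = first + len(site)
    end_second = second + len(site)
    remaining = seq[:end_first] + seq[end_second:]
    excised = seq[end_first:end_second]
    return remaining, excised


if __name__ == "__main__":
    # Hypothetical toy construct: flank / site / marker / site / flank.
    toy = "aaaa" + FRT + "cccc" + FRT + "gggg"
    remaining, excised = excise_between_sites(toy)
    print(remaining)
    print(excised)
```

In the toy construct the string "cccc" stands in for the marker sequence; after the modelled excision the remaining sequence retains the flanking regions and one copy of the site.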


SEQUENCE LISTING 

<110> University of Florida 

<120> METHODS AND COMPOSITIONS FOR THE CHROMOSOMAL 
INTEGRATION OF HETEROLOGOUS DNA 

<130> BCI-021PC 

<140> 
<141> 

<150> 09/390,479 
<151> 1999-09-07 

<160> 7 

<170> PatentIn ver. 2.0 

<210> 1 
<211> 2427 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 

<220> 

<223> Ampicillin-resistance gene from position 2295 to 
position 1435 

<400> 1 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttgtattct atagtgtcac 
300 

ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg ttctaacgac 
360 

aatatgtaca agcctaattg tgtagcatct ggcttactga agcagaccct atcatctctc 
420 

tcgtaaactg ccgtcagagt cggtttggtt ggacgaacct tctgagtttc tggtaacgcc 
480 

gttccgcacc ccggaaatgg tcagcgaacc aatcagcagg gtcatcgcta gcccatggct 
540 

aattctgtca gccgttaagt gttcctgtgt cactgaaaat tgctttgaga ggctctaagg 
600 



gcttctcagt gcgttacatc cctggcttgt tgtccacaac cgttaaacct taaaagcttt 
660 

aaaagcctta tatattcttt tttttcttat aaaacttaaa accttagagg ctatttaagt 
720 

tgctgattta tattaatttt attgttcaaa catgagagct tagtacgtga aacatgagag 
780 

cttagtacgt tagccatgag agcttagtac gttagccatg agggtttagt tcgttaaaca 
840 

tgagagctta gtacgttaaa catgagagct tagtacgtga aacatgagag cttagtacgt 
900 

actatcaaca ggttgaactg cggatcttgc ggccgcaatc gggcaaatcg ctgaatattc 
960 

cttttgtctc cgaccatcag gcacctgagt cgctgtcttt ttcgtgacat tcagttcgct 
1020 

gcgctcacgg ctctggcagt gaatgggggt aaatggcact acaggcgcct tttatggatt 
1080 

catgcaagga aactcaaaat aatacaagaa aagcccgtca cgggcttctc agggcgtttt 
1140 

atggcgggtc tgctatgtgg tgctatctga ctttttgctg ttcagcagtt cctgccctct 
1200 

gattttccag tctgaccact tcggattatc ccgtgacagg tcattcagac tggctaatgc 
1260 

acccagtaag gcagcggtat catcaacggg gtctgacgct cagtggaacg aaaactcacg 
1320 

ttaagggatt ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta 
1380 

aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca 
1440 

atgcttaatc agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc 
1500 

ctgactcccc gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc 
1560 

tgcaatgata ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc 
1620 

agccggaagg gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat 
1680 

taattgttgc cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt 
1740 

tgccattgct acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc 
1800 

cggttcccaa cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag 
1860 

ctccttcggt cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt 
1920 

tatggcagca ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac 
1980 

tggtgagtac tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg 
2040 

cccggcgtca atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat 
2100 

tggaaaacgt tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc 
2160 

gatgtaaccc actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc 
2220 

tgggtgagca aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa 
2280 

atgttgaata ctcatactct tcctttttca atattattga agcatttatc agggttattg 
2340 

tctcatgagc ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg 
2400 

cacatttccc cgaaaagtgc cacttgc 
2427 
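
The feature lines in this listing give some genes with a start position greater than the end position (for example, "from position 2295 to position 1435" in SEQ ID NO: 1). The following Python sketch shows one way to read such a feature, assuming 1-based inclusive coordinates and assuming that descending coordinates denote the complementary strand. That reading is an assumption, although it is consistent with the printed sequence: positions 2293 to 2295 of SEQ ID NO: 1 read "cat", the reverse complement of the start codon "atg". The function names are hypothetical.

```python
# Complement table for lowercase DNA, as printed in the listing.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def extract_feature(seq: str, start: int, end: int) -> str:
    """Extract a feature given 1-based inclusive coordinates.

    If start > end, the feature is assumed to lie on the complementary
    strand and the reverse complement of positions end..start is returned.
    Whitespace between the 10-base groups of the listing is ignored.
    """
    seq = "".join(seq.split()).lower()
    if start <= end:
        return seq[start - 1:end]
    return reverse_complement(seq[end - 1:start])


if __name__ == "__main__":
    # First two 10-base groups of the row ending at position 2340 in
    # SEQ ID NO: 1 (positions 2281-2300), renumbered here from 1.
    excerpt = "atgttgaata ctcatactct"
    # Positions 13-15 of the excerpt correspond to 2293-2295.
    print(extract_feature(excerpt, 15, 13))
```

Applied to the full 2427-base sequence with the coordinates 2295 and 1435, the same function would return an 861-base string under these assumptions.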



<210> 2 

<211> 1929 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 



<220> 
<223> Kanamycin-resistance gene from position 1778 to 
position 984 
<400> 2 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttgtattct atagtgtcac 
300 

ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg ttctaacgac 
360 

aatatgtaca agcctaattg tgtagcatct ggcttactga agcagaccct atcatctctc 
420 

tcgtaaactg ccgtcagagt cggtttggtt ggacgaacct tctgagtttc tggtaacgcc 
480 

gttccgcacc ccggaaatgg tcagcgaacc aatcagcagg gtcatcgcta gcccatggct 
540 

aattctgtca gccgttaagt gttcctgtgt cactgaaaat tgctttgaga ggctctaagg 
600 

gcttctcagt gcgttacatc cctggcttgt tgtccacaac cgttaaacct taaaagcttt 
660 

aaaagcctta tatattcttt tttttcttat aaaacttaaa accttagagg ctatttaagt 
720 

tgctgattta tattaatttt attgttcaaa catgagagct tagtacgtga aacatgagag 
780 

cttagtacgt tagccatgag agcttagtac gttagccatg agggtttagt tcgttaaaca 
840 

tgagagctta gtacgttaaa catgagagct tagtacgtga aacatgagag cttagtacgt 
900 

actatcaaca ggttgaactg cggatcttgc ggccgcaaaa attaaaaatg aagttttgac 
960 

ggtatcgaac cccagagtcc cgctcagaag aactcgtcaa gaaggcgata gaaggcgatg 
1020 

cgctgcgaat cgggagcggc gataccgtaa agcacgagga agcggtcagc ccattcgccg 
1080 

ccaagctctt cagcaatatc acgggtagcc aacgctatgt cctgatagcg gtccgccaca 
1140 

cccagccggc cacagtcgat gaatccagaa aagcggccat tttccaccat gatattcggc 
1200 

aagcaggcat cgccatgggt cacgacgaga tcctcgccgt cgggcatccg cgccttgagc 
1260 

ctggcgaaca gttcggctgg cgcgagcccc tgatgctctt cgtccagatc atcctgatcg 
1320 

acaagaccgg cttccatccg agtacgtgct cgctcgatgc gatgtttcgc ttggtggtcg 
1380 

aatgggcagg tagccggatc aagcgtatgc agccgccgca ttgcatcagc catgatggat 

1440 

actttctcgg caggagcaag gtgagatgac aggagatcct gccccggcac ttcgcccaat 
1500 

agcagccagt cccttcccgc ttcagtgaca acgtcgagca cagctgcgca aggaacgccc 
1560 

gtcgtggcca gccacgatag ccgcgctgcc tcgtcttgga gttcattcag ggcaccggac 
1620 

aggtcggtct tgacaaaaag aaccgggcgc ccctgcgctg acagccggaa cacggcggca 
1680 

tcagagcagc cgattgtctg ttgtgcccag tcatagccga atagcctctc cacccaagcg 
1740 

gccggagaac ctgcgtgcaa tccatcttgt tcaatcatgc gaaacgatcc tcatcctgtc 
1800 

tcttgatcca ctagattatt gaagcattta tcagggttat tgtctcatga gcggatacat 
1860 

atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt 
1920 

gccacctgc 
1929 



<210> 3 
<211> 2003 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 



<220> 
<223> chloramphenicol-resistance gene from position 1704 
to position 1045 

<400> 3 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttgtattct atagtgtcac 
300 

ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg ttctaacgac 
360 

aatatgtaca agcctaattg tgtagcatct ggcttactga agcagaccct atcatctctc 
420 

tcgtaaactg ccgtcagagt cggtttggtt ggacgaacct tctgagtttc tggtaacgcc 
480 

gttccgcacc ccggaaatgg tcagcgaacc aatcagcagg gtcatcgcta gcccatggct 
540 

aattctgtca gccgttaagt gttcctgtgt cactgaaaat tgctttgaga ggctctaagg 
600 

gcttctcagt gcgttacatc cctggcttgt tgtccacaac cgttaaacct taaaagcttt 
660 

aaaagcctta tatattcttt tttttcttat aaaacttaaa accttagagg ctatttaagt 
720 

tgctgattta tattaatttt attgttcaaa catgagagct tagtacgtga aacatgagag 
780 

cttagtacgt tagccatgag agcttagtac gttagccatg agggtttagt tcgttaaaca 
840 

tgagagctta gtacgttaaa catgagagct tagtacgtga aacatgagag cttagtacgt 
900 

actatcaaca ggttgaactg cggatcttgc ggccgcaaaa attaaaaatg aagttttaaa 
960 

tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 
1020 

gcaccaataa ctgccttaaa aaaattacgc cccgccctgc cactcatcgc agtactgttg 
1080 



WO 01/18222 PCT/US00/22700 



taattcatta agcattctgc cgacatggaa gccatcacag acggcatgat gaacctgaat 
1140 

cgccagcggc atcagcacct tgtcgccttg cgtataatat ttgcccatgg tgaaaacggg 
1200 

ggcgaagaag ttgtccatat tggccacgtt taaatcaaaa ctggtgaaac tcacccaggg 
1260 

attggctgag acgaaaaaca tattctcaat aaacccttta gggaaatagg ccaggttttc 
1320 

accgtaacac gccacatctt gcgaatatat gtgtagaaac tgccggaaat cgtcgtggta 
1380 

ttcactccag agcgatgaaa acgtttcagt ttgctcatgg aaaacggtgt aacaagggtg 
1440 

aacactatcc catatcacca gctcaccgtc tttcattgcc atacggaatt tcggatgagc 
1500 

attcatcagg cgggcaagaa tgtgaataaa ggccggataa aacttgtgct tatttttctt 
1560 

tacggtcttt aaaaaggccg taatatccag ctgaacggtc tggttatagg tacattgagc 
1620 

aactgactga aatgcctcaa aatgttcttt acgatgccat tgggatatat caacggtggt 
1680 

atatccagtg atttttttct ccattttagc ttccttagct cctgaaaatc tcgataactc 
1740 

aaaaaatacg cccggtagtg atcttatttc attatggtga aagttggaac ctcttacgtg 
1800 

ccgatcaacg tctcattttc gccaaaagtt ggcccagggc ttcccggtat caacagggac 
1860 

accaggattt atttattctg cgaagtgatc ttccgtcaca ggtatttatt cggcgcaaag 
1920 

tgcgtcgggt gatgctgcca acttactgat ttagtgtatg atggtgtttt tgaggtgctc 
1980 

cagtggcttc tgtttctatc agc 
2003 



<210> 4 

<211> 3246 

<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: cloning 
vectors 

<220> 

<223> Ampicillin-resistance gene from position 3114 to 
position 2254 

<400> 4 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttggcactg gctgatcagc 
300 

tagcccatgg gtatggacag ttttcccttt gatatgtaac ggtgaacagt tgttctactt 
360 

ttgtttgtta gtcttgatgc ttcactgata gatacaagag ccataagaac ctcagatcct 
420 

tccgtattta gccagtatgt tctctagtgt ggttcgttgt ttttgcgtga gccatgagaa 
480 

cgaaccattg agatcatgct tactttgcat gtcactcaaa aattttgcct caaaactggt 
540 

gagctgaatt tttgcagtta aagcatcgtg tagtgttttt cttagtccgt tacgtaggta 
600 

ggaatctgat gtaatggttg ttggtatttt gtcaccattc atttttatct ggttgttctc 
660 

aagttcggtt acgagatcca tttgtctatc tagttcaact tggaaaatca acgtatcagt 
720 

cgggcggcct cgcttatcaa ccaccaattt catattgctg taagtgttta aatctttact 
780 

tattggtttc aaaacccatt ggttaagcct tttaaactca tggtagttat tttcaagcat 
840 

taacatgaac ttaaattcat caaggctaat ctctatattt gccttgtgag ttttcttttg 
900 

tgttagttct tttaataacc actcataaat cctcatagag tatttgtttt caaaagactt 
960 



aacatgttcc agattatatt ttatgaattt ttttaactgg aaaagataag gcaatatctc 
1020 



ttcactaaaa actaattcta atttttcgct tgagaacttg gcatagtttg tccactggaa 
1080 

aatctcaaag cctttaacca aaggattcct gatttccaca gttctcgtca tcagctctct 
1140 



ggttgcttta gctaatacac cataagcatt ttccctactg atgttcatca tctgaacgta 
1200 



ttggttataa gtgaacgata ccgtccgttc tttccttgta gggttttcaa tcgtggggtt 
1260 

gagtagtgcc acacagcata aaattagctt ggtttcatgc tccgttaagt catagcgact 
1320 



aatcgctagt tcatttgctt tgaaaacaac taattcagac atacatctca attggtctag 
1380 

gtgattttaa tcactatacc aattgagatg ggctagtcaa tgataattac tagtcctttt 
1440 

cctttgagtt gtgggtatct gtaaattctg ctagaccttt gctggaaaac ttgtaaattc 
1500 



tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt ttttttgttt atattcaagt 
1560 



ggttataatt tatagaataa agaaagaata aaaaaagata aaaagaatag atcccagccc 
1620 



tgtgtataac tcactacttt agtcagttcc gcagtattac aaaaggatgt cgcaaacgct 
1680 



gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc ttaagtagca ccctcgcaag 
1740 



ctcggttgcg gccgcaatcg ggcaaatcgc tgaatattcc ttttgtctcc gaccatcagg 
1800 



cacctgagtc gctgtctttt tcgtgacatt cagttcgctg cgctcacggc tctggcagtg 
1860 



aatgggggta aatggcacta caggcgcctt ttatggattc atgcaaggaa actacccata 
1920 



atacaagaaa agcccgtcac gggcttctca gggcgtttta tggcgggtct gctatgtggt 
1980 



gctatctgac tttttgctgt tcagcagttc ctgccctctg attttccagt ctgaccactt 
2040 



cggattatcc cgtgacaggt cattcagact ggctaatgca cccagtaagg cagcggtatc 
2100 

atcaacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 
2160 

attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 
2220 

ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 
2280 

tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 
2340 

aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 
2400 

acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 
2460 

aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 
2520 

agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 
2580 

ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 
2640 

agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 
2700 

tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 
2760 

tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 
2820 

attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 
2880 

taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 

2940 

aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 
3000 

caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 
3060 

gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 
3120 

cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 
3180 

tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 
3240 

acttgc 
3246 



<210> 5 
<211> 3443 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 

<220> 

<223> kanamycin-resistance gene from position 3292 to 
position 2498 

<400> 5 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttggcactg gctgatcagc 
300 

tagcccatgg gtatggacag ttttcccttt gatatgtaac ggtgaacagt tgttctactt 
360 

ttgtttgtta gtcttgatgc ttcactgata gatacaagag ccataagaac ctcagatcct 
420 

tccgtattta gccagtatgt tctctagtgt ggttcgttgt ttttgcgtga gccatgagaa 
480 

cgaaccattg agatcatgct tactttgcat gtcactcaaa aattttgcct caaaactggt 
540 

gagctgaatt tttgcagtta aagcatcgtg tagtgttttt cttagtccgt tacgtaggta 
600 

ggaatctgat gtaatggttg ttggtatttt gtcaccattc atttttatct ggttgttctc 
660 

aagttcggtt acgagatcca tttgtctatc tagttcaact tggaaaatca acgtatcagt 
720 

cgggcggcct cgcttatcaa ccaccaattt catattgctg taagtgttta aatctttact 
780 

tattggtttc aaaacccatt ggttaagcct tttaaactca tggtagttat tttcaagcat 
840 

taacatgaac ttaaattcat caaggctaat ctctatattt gccttgtgag ttttcttttg 
900 

tgttagttct tttaataacc actcataaat cctcatagag tatttgtttt caaaagactt 
960 

aacatgttcc agattatatt ttatgaattt ttttaactgg aaaagataag gcaatatctc 
1020 

ttcactaaaa actaattcta atttttcgct tgagaacttg gcatagtttg tccactggaa 
1080 

aatctcaaag cctttaacca aaggattcct gatttccaca gttctcgtca tcagctctct 

1140 

ggttgcttta gctaatacac cataagcatt ttccctactg atgttcatca tctgaacgta 
1200 

ttggttataa gtgaacgata ccgtccgttc tttccttgta gggttttcaa tcgtggggtt 
1260 

gagtagtgcc acacagcata aaattagctt ggtttcatgc tccgttaagt catagcgact 
1320 

aatcgctagt tcatttgctt tgaaaacaac taattcagac atacatctca attggtctag 
1380 

gtgattttaa tcactatacc aattgagatg ggctagtcaa tgataattac tagtcctttt 
1440 

cctttgagtt gtgggtatct gtaaattctg ctagaccttt gctggaaaac ttgtaaattc 
1500 

tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt ttttttgttt atattcaagt 
1560 

ggttataatt tatagaataa agaaagaata aaaaaagata aaaagaatag atcccagccc 
1620 

tgtgtataac tcactacttt agtcagttcc gcagtattac aaaaggatgt cgcaaacgct 
1680 

gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc ttaagtagca ccctcgcaag 
1740 

ctcggttgcg gccgcaatcg ggcaaatcgc tgaatattcc ttttgtctcc gaccatcagg 
1800 

cacctgagtc gctgtctttt tcgtgacatt cagttcgctg cgctcacggc tctggcagtg 
1860 

aatgggggta aatggcacta caggcgcctt ttatggattc atgcaaggaa actacccata 
1920 

atacaagaaa agcccgtcac gggcttctca gggcgtttta tggcgggtct gctatgtggt 
1980 

gctatctgac tttttgctgt tcagcagttc ctgccctctg attttccagt ctgaccactt 
2040 

cggattatcc cgtgacaggt cattcagact ggctaatgca cccagtaagg cagcggtatc 
2100 

atcaacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 
2160 

attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 
2220 

ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 
2280 

tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 
2340 

aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 
2400 

acgctcaccg gctccagatt tatcagcaat aaaccagcca gccgggccgc aaaaattaaa 
2460 

aatgaagttt tgacggtatc gaaccccaga gtcccgctca gaagaactcg tcaagaaggc 
2520 

gatagaaggc gatgcgctgc gaatcgggag cggcgatacc gtaaagcacg aggaagcggt 
2580 

cagcccattc gccgccaagc tcttcagcaa tatcacgggt agccaacgct atgtcctgat 
2640 

agcggtccgc cacacccagc cggccacagt cgatgaatcc agaaaagcgg ccattttcca 
2700 

ccatgatatt cggcaagcag gcatcgccat gggtcacgac gagatcctcg ccgtcgggca 
2760 

tccgcgcctt gagcctggcg aacagttcgg ctggcgcgag cccctgatgc tcttcgtcca 
2820 

gatcatcctg atcgacaaga ccggcttcca tccgagtacg tgctcgctcg atgcgatgtt 
2880 

tcgcttggtg gtcgaatggg caggtagccg gatcaagcgt atgcagccgc cgcattgcat 
2940 

cagccatgat ggatactttc tcggcaggag caaggtgaga tgacaggaga tcctgccccg 
3000 

gcacttcgcc caatagcagc cagtcccttc ccgcttcagt gacaacgtcg agcacagctg 
3060 

cgcaaggaac gcccgtcgtg gccagccacg atagccgcgc tgcctcgtct tggagttcat 
3120 

tcagggcacc ggacaggtcg gtcttgacaa aaagaaccgg gcgcccctgc gctgacagcc 
3180 

ggaacacggc ggcatcagag cagccgattg tctgttgtgc ccagtcatag ccgaatagcc 
3240 

tctccaccca agcggccgga gaacctgcgt gcaatccatc ttgttcaatc atgcgaaacg 
3300 

atcctcatcc tgtctcttga tccactagat tattgaagca tttatcaggg ttattgtctc 
3360 

atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca 
3420 

tttccccgaa aagtgccacc tgc 
3443 



<210> 6 
<211> 3508 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 

<220> 

<223> chloramphenicol-resistance gene from position 3209 
to position 2550 

<400> 6 

atcgatgaat tgatccgaag ttcctattct ctagaaagta taggaacttc gaattgtcga 60 

caagcttgat ctggcttatc gaaattaata cgactcacta tagggagacc ggaattccga 
120 

gcttggcgcg ccaggtcgac tctagaggat ccccggggaa gatcttccgg aagatcttcc 
180 

cgagctcgaa ttaattccgc gatgaattga tcccggaagt tcctattctc tagaaagtat 
240 

aggaacttcg aattggtcga caagctagct tgcatgcaag cttggcactg gctgatcagc 
300 

tagcccatgg gtatggacag ttttcccttt gatatgtaac ggtgaacagt tgttctactt 
360 

ttgtttgtta gtcttgatgc ttcactgata gatacaagag ccataagaac ctcagatcct 
420 

tccgtattta gccagtatgt tctctagtgt ggttcgttgt ttttgcgtga gccatgagaa 
480 

cgaaccattg agatcatgct tactttgcat gtcactcaaa aattttgcct caaaactggt 
540 

gagctgaatt tttgcagtta aagcatcgtg tagtgttttt cttagtccgt tacgtaggta 
600 

ggaatctgat gtaatggttg ttggtatttt gtcaccattc atttttatct ggttgttctc 
660 

aagttcggtt acgagatcca tttgtctatc tagttcaact tggaaaatca acgtatcagt 
720 

cgggcggcct cgcttatcaa ccaccaattt catattgctg taagtgttta aatctttact 
780 

tattggtttc aaaacccatt ggttaagcct tttaaactca tggtagttat tttcaagcat 
840 

taacatgaac ttaaattcat caaggctaat ctctatattt gccttgtgag ttttcttttg 
900 

tgttagttct tttaataacc actcataaat cctcatagag tatttgtttt caaaagactt 
960 

aacatgttcc agattatatt ttatgaattt ttttaactgg aaaagataag gcaatatctc 
1020 

ttcactaaaa actaattcta atttttcgct tgagaacttg gcatagtttg tccactggaa 
1080 

aatctcaaag cctttaacca aaggattcct gatttccaca gttctcgtca tcagctctct 
1140 

ggttgcttta gctaatacac cataagcatt ttccctactg atgttcatca tctgaacgta 
1200 

ttggttataa gtgaacgata ccgtccgttc tttccttgta gggttttcaa tcgtggggtt 
1260 

gagtagtgcc acacagcata aaattagctt ggtttcatgc tccgttaagt catagcgact 
1320 

aatcgctagt tcatttgctt tgaaaacaac taattcagac atacatctca attggtctag 
1380 

gtgattttaa tcactatacc aattgagatg ggctagtcaa tgataattac tagtcctttt 
1440 

cctttgagtt gtgggtatct gtaaattctg ctagaccttt gctggaaaac ttgtaaattc 
1500 

tgctagaccc tctgtaaatt ccgctagacc tttgtgtgtt ttttttgttt atattcaagt 
1560 

ggttataatt tatagaataa agaaagaata aaaaaagata aaaagaatag atcccagccc 
1620 

tgtgtataac tcactacttt agtcagttcc gcagtattac aaaaggatgt cgcaaacgct 
1680 

gtttgctcct ctacaaaaca gaccttaaaa ccctaaaggc ttaagtagca ccctcgcaag 
1740 

ctcggttgcg gccgcaatcg ggcaaatcgc tgaatattcc ttttgtctcc gaccatcagg 
1800 

cacctgagtc gctgtctttt tcgtgacatt cagttcgctg cgctcacggc tctggcagtg 
1860 

aatgggggta aatggcacta caggcgcctt ttatggattc atgcaaggaa actacccata 
1920 

atacaagaaa agcccgtcac gggcttctca gggcgtttta tggcgggtct gctatgtggt 
1980 

gctatctgac tttttgctgt tcagcagttc ctgccctctg attttccagt ctgaccactt 
2040 

cggattatcc cgtgacaggt cattcagact ggctaatgca cccagtaagg cagcggtatc 
2100 

atcaacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 
2160 

attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 
2220 

ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 
2280 

tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 
2340 

aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 
2400 

acgctcaccg gctccagatt tatcagcaat aaaccagcca gccgaattaa aaatgaagtt 
2460 

ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca 
2520 

gtgaggcacc aataactgcc ttaaaaaaat tacgccccgc cctgccactc atcgcagtac 
2580 

tgttgtaatt cattaagcat tctgccgaca tggaagccat cacagacggc atgatgaacc 
2640 

tgaatcgcca gcggcatcag caccttgtcg ccttgcgtat aatatttgcc catggtgaaa 
2700 

acgggggcga agaagttgtc catattggcc acgtttaaat caaaactggt gaaactcacc 
2760 

cagggattgg ctgagacgaa aaacatattc tcaataaacc ctttagggaa ataggccagg 
2820 

ttttcaccgt aacacgccac atcttgcgaa tatatgtgta gaaactgccg gaaatcgtcg 
2880 

tggtattcac tccagagcga tgaaaacgtt tcagtttgct catggaaaac ggtgtaacaa 
2940 

gggtgaacac tatcccatat caccagctca ccgtctttca ttgccatacg gaatttcgga 
3000 

tgagcattca tcaggcgggc aagaatgtga ataaaggccg gataaaactt gtgcttattt 
3060 

ttctttacgg tctttaaaaa ggccgtaata tccagctgaa cggtctggtt ataggtacat 
3120 

tgagcaactg actgaaatgc ctcaaaatgt tctttacgat gccattggga tatatcaacg 
3180 

gtggtatatc cagtgatttt tttctccatt ttagcttcct tagctcctga aaatctcgat 
3240 

aactcaaaaa atacgcccgg tagtgatctt atttcattat ggtgaaagtt ggaacctctt 
3300 

acgtgccgat caacgtctca ttttcgccaa aagttggccc agggcttccc ggtatcaaca 
3360 

gggacaccag gatttattta ttctgcgaag tgatcttccg tcacaggtat ttattcggcg 
3420 

caaagtgcgt cgggtgatgc tgccaactta ctgatttagt gtatgatggt gtttttgagg 
3480 

tgctccagtg gcttctgttt ctatcagc 
3508 



<210> 7 

<211> 9460 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: cloning 
vectors 






<220> 

<223> adhE gene from E. coli from position 1890 to position 4565 
<220> 

<223> pdc gene from Z. mobilis from position 4701 to position 6380 
<220> 

<223> adhB gene from Z. mobilis from position 6508 to position 7659 
<220> 

<223> chloramphenicol-resistance gene from position 8515 to position 
9174 

<220> 

<223> kanamycin-resistance gene from position 152 to position 946 
<400> 7 

gcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 60 
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatctagt 
120 

ggatcaagag acaggatgag gatcgtttcg catgattgaa caagatggat tgcacgcagg 
180 

ttctccggcc gcttgggtgg agaggctatt cggctatgac tgggcacaac agacaatcgg 
240 

ctgctctgat gccgccgtgt tccggctgtc agcgcagggg cgcccggttc tttttgtcaa 
300 

gaccgacctg tccggtgccc tgaatgaact ccaagacgag gcagcgcggc tatcgtggct 
360 

ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga 
420 

ctggctgcta ttgggcgaag tgccggggca ggatctcctg tcatctcacc ttgctcctgc 
480 

cgagaaagta tccatcatgg ctgatgcaat gcggcggctg catacgcttg atccggctac 
540 

ctgcccattc gaccaccaag cgaaacatcg catcgagcga gcacgtactc ggatggaagc 
600 




cggtcttgtc gatcaggatg atctggacga agagcatcag gggctcgcgc cagccgaact 
660 

gttcgccagg ctcaaggcgc ggatgcccga cggcgaggat ctcgtcgtga cccatggcga 
720 

tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt tctggattca tcgactgtgg 
780 

ccggctgggt gtggcggacc gctatcagga catagcgttg gctacccgtg atattgctga 
840 

agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga 
900 

ttcgcagcgc atcgccttct atcgccttct tgacgagttc ttctgagcgg gactctgggg 
960 

ttcgataccg tcaaaacttc atttttaatt tttgcggccg caagatccgc agttcaacct 
1020 

gttgatagta cgtactaagc tctcatgttt cacgtactaa gctctcatgt ttaacgtact 
1080 

aagctctcat gtttaacgaa ctaaaccctc atggctaacg tactaagctc tcatggctaa 
1140 

cgtactaagc tctcatgttt cacgtactaa gctctcatgt ttgaacaata aaattaatat 
1200 

aaatcagcaa cttaaatagc ctctaaggtt ttaagtttta taagaaaaaa aagaatatat 
1260 

aaggctttta aagcttttaa ggtttaacgg ttgtggacaa caagccaggg atgtaacgca 
1320 

ctgagaagcc cttagagcct ctcaaagcaa ttttcagtga cacaggaaca cttaacggct 
1380 

gacagaatta gccatgggct agcgatgacc ctgctgattg gttcgctgac catttccggg 
1440 

gtgcggaacg gcgttaccag aaactcagaa ggttcgtcca accaaaccga ctctgacggc 
1500 

agtttacgag agagatgata gggtctgctt cagtaagcca gatgctacac aattaggctt 
1560 

gtacatattg tcgttagaac gcggctacaa ttaatacata accttatgta tcatacacat 
1620 

acgatttagg tgacactata gaatacaagc ttgcatgcaa gctagcttgt cgaccaattc 
1680 

gagttcctat actttctaga gaataggaac ttccgggatc aattcatcgc ggaattaatt 
1740 

cgagctcggg aagatcttcc ggaagatctt ccccggggat cctctagagt cgacctggcg 
1800 

cgccccttga ctagagggtc gacgcatgcc tgtacatccg gagacgcgtc acggccgaag 
1860 

ctagcgaatt cgcccttaat tgctcttcca tggctgttac taatgtcgct gaacttaacg 
1920 

cactcgtaga gcgtgtaaaa aaagcccagc gtgaatatgc cagtttcact caagagcaag 
1980 

tagacaaaat cttccgcgcc gccgctctgg ctgctgcaga tgctcgaatc ccactcgcga 
2040 

aaatggccgt tgccgaatcc ggcatgggta tcgtcgaaga taaagtgatc aaaaaccact 
2100 

ttgcttctga atatatctac aacgcctata aagatgaaaa aacctgtggt gttctgtctg 
2160 

aagacgacac ttttggtacc atcactatcg ctgaaccaat cggtattatt tgcggtatcg 
2220 

ttccgaccac taacccgact tcaactgcta tcttcaaatc gctgatcagt ctgaagaccc 
2280 

gtaacgccat tatcttctcc ccgcacccgc gtgcaaaaga tgccaccaac aaagcggctg 
2340 

atatcgttct gcaggctgct atcgctgccg gtgctccgaa agatctgatc ggctggatcg 
2400 

atcaaccttc tgttgaactg tctaacgcac tgatgcacca cccagacatc aacctgatcc 
2460 

tcgcgactgg tggtccgggc atggttaaag ccgcatacag ctccggtaaa ccagctatcg 
2520 

gtgtaggcgc gggcaacact ccagttgtta tcgatgaaac tgctgatatc aaacgtgcag 
2580 

ttgcatctgt actgatgtcc aaaaccttcg acaacggcgt aatctgtgct tctgaacagt 
2640 

ctgttgttgt tgttgactct gtttatgacg ctgtacgtga acgttttgca acccacggcg 
2700 

gctatctgtt gcagggtaaa gagctgaaag ctgttcagga tgttatcctg aaaaacggtg 
2760 

cgctgaacgc ggctatcgtt ggtcagccag cctataaaat tgctgaactg gcaggcttct 
2820 

ctgtaccaga aaacaccaag attctgatcg gtgaagtgac cgttgttgat gaaagcgaac 
2880 

cgttcgcaca tgaaaaactg tccccgactc tggcaatgta ccgcgctaaa gatttcgaag 
2940 

acgcggtaga aaaagcagag aaactggttg ctatgggcgg tatcggtcat acctcttgcc 
3000 

tgtacactga ccaggataac caaccggctc gcgtttctta cttcggtcag aaaatgaaaa 
3060 

cggcgcgtat cctgattaac accccagcgt ctcagggtgg tatcggtgac ctgtataact 
3120 

tcaaactcgc accttccctg actctgggtt gtggttcttg gggtggtaac tccatctctg 
3180 

aaaacgttgg tccgaaacac ctgatcaaca agaaaaccgt tgctaagcga gctgaaaaca 
3240 

tgttgtggca caaacttccg aaatctatct acttccgccg tggctccctg ccaatcgcgc 
3300 

tggatgaagt gattactgat ggccacaaac gtgcgctcat cgtgactgac cgcttcctgt 
3360 

tcaacaatgg ttatgctgat cagatcactt ccgtactgaa agcagcaggc gttgaaactg 
3420 

aagtcttctt cgaagtagaa gcggacccga ccctgagcat cgttcgtaaa ggtgcagaac 
3480 

tggcaaactc cttcaaacca gacgtgatta tcgcgctggg tggtggttcc ccgatggacg 
3540 

ccgcgaagat catgtgggtt atgtacgaac atccggaaac tcacttcgaa gagctggcgc 
3600 

tgcgctttat ggatatccgt aaacgtatct acaagttccc gaaaatgggc gtgaaagcga 
3660 

aaatgatcgc tgtcaccacc acttctggta caggttctga agtcactccg tttgcggttg 
3720 

taactgacga cgctactggt cagaaatatc cgctggcaga ctatgcgctg actccggata 
3780 

tggcgattgt cgacgccaac ctggttatgg acatgccgaa gtccctgtgt gctttcggtg 
3840 

gtctggacgc agtaactcac gccatggaag cttatgtttc tgtactggca tctgagttct 
3900 

ctgatggtca ggctctgcag gcactgaaac tgctgaaaga atatctgcca gcgtcctacc 
3960 

acgaagggtc taaaaatccg gtagcgcgtg aacgtgttca cagtgcagcg actatcgcgg 
4020 

gtatcgcgtt tgcgaacgcc ttcctgggtg tatgtcactc aatggcgcac aaactgggtt 
4080 

cccagttcca tattccgcac ggtctggcaa acgccctgct gatttgtaac gttattcgct 
4140 

acaatgcgaa cgacaacccg accaagcaga ctgcattcag ccagtatgac cgtccgcagg 
4200 

ctcgccgtcg ttatgctgaa attgccgacc acttgggtct gagcgcaccg ggcgaccgta 
4260 

ctgctgctaa gatcgagaaa ctgctggcat ggctggaaac gctgaaagct gaactgggta 
4320 

ttccgaaatc tatccgtgaa gctggcgttc aggaagcaga cttcctggcg aacgtggata 
4380 

aactgtctga agatgcattc gatgaccagt gcaccggcgc taacccgcgt tacccgctga 
4440 

tctccgagct gaaacagatt ctgctggata cctactacgg tcgtgattat gtagaaggtg 
4500 

aaactgcagc gaagaaagaa gctgctccgg ctaaagctga gaaaaaagcg aaaaaatccg 
4560 

cttaacgaag agcaattaag ggcgaattcg tggatcctcg tcggttcaaa aaatgcctat 
4620 

agctaaatcc ggaacgacac tttagaggtt tctgggtcat cctgattcag acatagtgtt 
4680 

ttgaatatat ggagtaagca atgagttata ctgtcggtac ctatttagcg gcgcttgtcc 
4740 

agattggtct caagcatcac ttcgcagtcg cgggcgacta caacctcgtc cttcttgaca 
4800 

acctgctttt gaacaaaaac atggagcagg tttattgctg taacgaactg aactgcggtt 
4860 

tcagtgcaga aggttatgct cgtgccaaag cggacgcagc agccgtcgtt acctacagcg 
4920 

tcggtgcgct ttccgcattt gatgctatcg gtggcgccta tgcagaaaac cttccggtta 
4980 

tcctgatctc cggtgctccg aacaacaatg atcacgctgc tggtcacgtg ttgcatcacg 
5040 

ctcttggcaa aaccgactat cactatcagt tggaaatggc caagaacatc acggccgcag 
5100 

ctgaagcgat ttacacccca gaagaagctc cggctaaaat cgatcacgtg attaaaactg 
5160 

ctcttcgtga gaagaagccg gtttatctcg aaatcgcttg caacattgct tccatgccct 
5220 

gcgccgctcc tggaccggca agcgcattgt tcaatgacga agccagcgac gaagcttctt 
5280 

tgaatgcagc ggttgaagaa accctgaaat tcatcgccaa ccgcgacaaa gttgccgtcc 
5340 

tcgtcggcag caagctgcgc gcagctggtg ctgaagaagc tgctgtcaaa tttgctgatg 
5400 

ctctcggtgg cgcagttgct accatggctg ctgcaaaaag cttcttccag aagaaaaccg 
5460 

cattacatcg gtacctcatg ggtgaagtca gctatccggg cgttgaaaag acgatgaaag 
5520 

aagccgatgc ggttatcgct ctggctcctg tcttcaacga ctactccacc actggttgga 
5580 

cggatattcc tgatcctaag aaactggttc tcgctgaacc gcgttctgtc gtcgttaacg 
5640 

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gcgttcgctt ccccagcgtt catctgaaag actatctgac ccgtttggct cagaaagttt 
5700 

ccaagaaaac cggtgctttg gacttcttca aatccctcaa tgcaggtgaa ctgaagaaag 
5760 

ccgctccggc tgatccgagt gctccgttgg tcaacgcaga aatcgcccgt caggtcgaag 
5820 

ctcttctgac cccgaacacg acggttattg ctgaaaccgg tgactcttgg ttcaatgctc 
5880 

agcgcatgaa gctcccgaac ggtgctcgcg ttgaatatga aatgcagtgg ggtcacatcg 
5940 

gttggtccgt tcctgccgcc ttcggttatg ccgtcggtgc tccggaacgt cgcaacatcc 
6000 

tcatggttgg tgatggttcc ttccagctga cggctcagga agtcgctcag atggttcgcc 
6060 

tgaaactgcc ggttatcatc ttcttgatca ataactatgg ttacaccatc gaagttatga 
6120 

tccatgatgg tccgtacaac aacatcaaga actgggatta tgccggtctg atggaagtgt 
6180 

tcaacggtaa cggtggttat gacagcggcg ctggtaaagg cctgaaggct aaaaccggtg 
6240 

gcgaactggc agaagctatc aaggttgctc tggcaaacac cgacggccca accctgatcg 
6300 

aatgcttcat cggtcgtgaa gactgcactg aagaattggt caaatggggt aagcgcgttg 
6360 

ctgcccgcca acagccgtaa gcctgttaac aagctcctct agtttttaaa acgggattaa 
6420 

aacgcaaaaa caatagaaag cgatttctcg aaaatggttg ttttcgggtt gttgctttaa 
6480 

actagtatgt agggtgaggt tatagctatg gcttcttcaa ctttttatat tcctttcgtc 
6540 

aacgaaatgg gcgaaggttc gcttgaaaaa gcaatcaagg atcttaacgg cagcggcttt 
6600 

aaaaatgcgc tgatcgtttc tgatgctttc atgaacaaat ccggtgttgt gaagcaggtt 
6660 

gctgacctgt tgaaagcaca gggtattaat tctgctgttt atgatggcgt tatgccgaac 
6720 

ccgactgtta ccgcagttct ggaaggcctt aagatcctga aggataacaa ttcagacttc 
6780 

gtcatctccc tcggtggtgg ttctccccat gactgcgcca aagccatcgc tctggtcgca 
6840 

accaatggtg gtgaagtcaa agactacgaa ggtatcgaca aatctaagaa acctgccctg 
6900 

cctttgatgt caatcaacac gacggctggt acggcttctg aaatgacgcg tttctgcatc 
6960 

atcactgatg aagtccgtca cgttaagatg gccattgttg accgtcacgt taccccgatg 
7020 

gtttccgtca acgatcctct gttgatggtt ggtatgccaa aaggcctgac cgccgccacc 
7080 

ggtatggatg ctctgaccca cgcatttgaa gcttattctt caacggcagc tactccgatc 
7140 

accgatgctt gcgctttgaa agcagcttcc atgatcgcta agaatctgaa gaccgcttgc 
7200 

gacaacggta aggatatgcc ggctcgtgaa gctatggctt atgcccaatt cctcgctggt 
7260 

atggccttca acaacgcttc gcttggttat gtccatgcta tggctcacca gttgggcggt 
7320 

tactacaacc tgccgcatgg tgtctgcaac gctgttctgc ttccgcatgt tctggcttat 
7380 

aacgcctctg tcgttgctgg tcgtctgaaa gacgttggtg ttgctatggg tctcgatatc 
7440 

gccaatctcg gtgataaaga aggcgcagaa gccaccattc aggctgttcg cgatctggct 
7500 

gctcccattg gtattccagc aaacctgacc gagctgggtg ctaagaaaga agatgtgccg 
7560 

cttcttgctg accacgctct gaaagatgct tgtgctctga ccaacccgcg tcagggtgat 
7620 

cagaaagaag ttgaagaact cttcctgagc gctttctaat ttcaaaacag gaaaacggtt 
7680 

ttccgtcctg tcttgatttt caagcaaaca atgcctccga tttctaatcg gaggcatttg 
7740 

tttttgttta ttgcaaaaac aaaaaatatt gttacaaatt tttacaggct attaagccta 
7800 

ccgtcataaa taatttgcca tttaaagcct attatcagga ttttcgcccc gatttcagcc 
7860 

atgtatcgct catgatcgcg acatgttctg atattttcct ctaaaaaaga taaaaagtct 
7920 

tttcgcttcg gcagaagagg ttcatcatga acaaaaattc ggcatttttg atgtaaccca 
7980 

ctcgtgcacc caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa 
8040 

aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac 
8100 

tcatactctt cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg 
8160 

gatacatatt tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc 
8220 

gaaaagtgcc acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata 
8280 

ggcgtatcac gaggcccttt cgtcttcgaa taaatacctg tgacggaaga tcacttcgca 
8340 

gaataaataa atcctggtgt ccctgttgat accgggaagc cctgggccaa cttttggcga 
8400 

aaatgagacg ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca 
8460 

ctaccgggcg tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatggag 
8520 

aaaaaaatca ctggatatac caccgttgat atatcccaat ggcatcgtaa agaacatttt 
8580 

gaggcatttc agtcagttgc tcaatgtacc tataaccaga ccgttcagct ggatattacg 
8640 

gcctttttaa agaccgtaaa gaaaaataag cacaagtttt atccggcctt tattcacatt 
8700 

cttgcccgcc tgatgaatgc tcatccggaa ttccgtatgg caatgaaaga cggtgagctg 
8760 

gtgatatggg atagtgttca cccttgttac accgttttcc atgagcaaac tgaaacgttt 
8820 

tcatcgctct ggagtgaata ccacgacgat ttccggcagt ttctacacat atattcgcaa 
8880 

gatgtggcgt gttacggtga aaacctggcc tatttcccta aagggtttat tgagaatatg 
8940 

tttttcgtct cagccaatcc ctgggtgagt ttcaccagtt ttgatttaaa cgtggccaat 
9000 

atggacaact tcttcgcccc cgttttcacc atgggcaaat attatacgca aggcgacaag 
9060 

gtgctgatgc cgctggcgat tcaggttcat catgccgttt gtgatggctt ccatgtcggc 
9120 

agaatgctta atgaattaca acagtactgc gatgagtggc agggcggggc gtaatttttt 
9180 

taaggcagtt attggtgccc ttaaacgcct ggtgctacgc ctgaataagt gataataagc 
9240 

ggatgaatgg cagaaattcg aaagcaaatt cgacccggat ccagatatcc tgcagagaag 
9300 

cttggcgcca gccggcttca attgcacggg cgcgccaagc tcggaattcc ggtctcccta 
9360 

tagtgagtcg tattaatttc gataagccag atcaagcttg tcgacaattc gaagttccta 
9420 

tactttctag agaataggaa cttcggatca attcatcgat 
9460 